1 Introduction

This project analyzes event data from the Phoenix Data Project.

1.1 Getting Started

Load the necessary packages.

library(plyr)
library(yaml)
library(tidyverse)

1.2 Loading Dataset

The R/phoenix.R script defines phoenix_load(), a function that downloads the dataset if necessary and then loads it. We read the settings from config.yml, set the start date to the beginning of 2017, and load the dataset.

source('R/phoenix.R')

config <- yaml.load_file("config.yml")
events <- phoenix_load(config, start_date = "2017-01-01")
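The "download if necessary" behavior can be pictured with a minimal sketch. This is not the code in R/phoenix.R, which is not shown here. The helper name fetch_if_missing(), its arguments, and the example URL are all hypothetical, and the sketch simply assumes that "necessary" means "no local copy exists yet".

```r
# Hypothetical helper, NOT part of R/phoenix.R: fetch a file only when no
# local copy exists. `downloader` is injectable so the logic can be tried
# offline with a stand-in function.
fetch_if_missing <- function(url, dest, downloader = utils::download.file) {
  if (!file.exists(dest)) {
    # Create the target directory on first use.
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    downloader(url, dest)
  }
  dest
}

# Offline demonstration with a fake downloader that just writes a tiny file.
fake_download <- function(url, dest) writeLines("EventID,Date", dest)
local_copy <- file.path(tempfile(), "events.csv")

fetch_if_missing("https://example.org/events.csv", local_copy, fake_download)
# A second call finds the file and skips the download step.
fetch_if_missing("https://example.org/events.csv", local_copy, fake_download)
```

Passing the downloader as an argument keeps the caching decision separate from the network call, which makes the behavior easy to check without internet access.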

Let’s see what’s in the dataset:

str(events)
## Classes 'tbl_df', 'tbl' and 'data.frame':    267382 obs. of  26 variables:
##  $ EventID             : chr  "2867898_v1.3.0" "2867899_v1.3.0" "2867900_v1.3.0" "2867901_v1.3.0" ...
##  $ Date                : Date, format: "2017-01-02" "2016-12-23" ...
##  $ Year                : int  2017 2016 2016 2017 2017 2017 2017 2017 2017 2017 ...
##  $ Month               : int  1 12 12 1 1 1 1 1 1 1 ...
##  $ Day                 : int  2 23 29 2 2 2 2 1 1 2 ...
##  $ SourceActorFull     : chr  "COG" "MNCCHE" "IRN" "DEU" ...
##  $ SourceActorEntity   : chr  "COG" "MNC" "IRN" "DEU" ...
##  $ SourceActorRole     : chr  "" "" "" "" ...
##  $ SourceActorAttribute: chr  "" "CHE" "" "" ...
##  $ TargetActorFull     : chr  "MED" "USAGOV" "SYR" "GBR" ...
##  $ TargetActorEntity   : chr  "" "USA" "SYR" "GBR" ...
##  $ TargetActorRole     : chr  "" "GOV" "" "" ...
##  $ TargetActorAttribute: chr  "" "" "" "" ...
##  $ EventCode           : chr  "010" "010" "010" "042" ...
##  $ EventRootCode       : chr  "01" "01" "01" "04" ...
##  $ PentaClass          : chr  "0" "0" "0" "1" ...
##  $ GoldsteinScore      : num  0 0 0 1.9 -4.4 1.9 0 1 1.9 0 ...
##  $ Issues              : chr  "" "" "" "" ...
##  $ Lat                 : num  NA 39.8 55.8 51.5 39.9 ...
##  $ Lon                 : num  NA -98.5 37.616 -0.126 116.397 ...
##  $ LocationName        : chr  "" "United States" "Moscow" "London" ...
##  $ StateName           : chr  "" "" "Moskva" "England" ...
##  $ CountryCode         : chr  "" "USA" "RUS" "GBR" ...
##  $ SentenceID          : chr  "586a9d7beaae1f0001eec49c_1" "5869ac06dc0402000134e7ef_0" "5869ac72d57de600015a7951_1" "586a3e29172dca00014c31b9_2;586a2f7fe580470001f1b3d6_2;586a36dfc6faa00001190301_3" ...
##  $ URLs                : chr  "http://www.nation.co.ke/news/africa/DR-Congo-set-for-talks-on-implementing-crisis-deal/1066-3504894-k0ud9b/index.html" "http://www.thelocal.ch/20161223/swiss-bank-to-pay-billions-to-settle-securities-disputes" "http://www.jpost.com/Middle-East/Syrian-army-says-countrywide-ceasefire-to-start-midnight-Thursday-476890" "https://in.news.yahoo.com/china-launches-first-freight-train-london-113004033.html;http://www.shanghaidaily.com"| __truncated__ ...
##  $ NewsSources         : chr  "kenya_nation" "local_switzerland" "jpost_me" "yahoo_india;shanghai_national;india_mint_econpol" ...
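The output above includes Date values such as 2016-12-23 even though start_date was set to "2017-01-01", which suggests that the start date may not be applied to the Date column itself (this is a guess, not something the source confirms). If only events dated within the year are wanted, a filter along the following lines could be used. The sketch runs on a small stand-in data frame, toy_events, rather than on the real events object.

```r
library(dplyr)

# Toy rows standing in for `events`; the dates mirror the str() output above.
toy_events <- data.frame(
  EventID = c("a", "b", "c"),
  Date = as.Date(c("2017-01-02", "2016-12-23", "2016-12-29")),
  stringsAsFactors = FALSE
)

start_date <- as.Date("2017-01-01")

# Keep only events dated on or after the requested start date.
recent <- toy_events %>%
  filter(Date >= start_date)
```

Applied to the real data, the same filter() call would be run on events instead of toy_events.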

Show the first rows of a few selected columns:

events %>%
  select(Date, SourceActorFull, TargetActorFull, EventCode, LocationName) %>%
  head()
## # A tibble: 6 × 5
##         Date SourceActorFull TargetActorFull EventCode  LocationName
##       <date>           <chr>           <chr>     <chr>         <chr>
## 1 2017-01-02             COG             MED       010              
## 2 2016-12-23          MNCCHE          USAGOV       010 United States
## 3 2016-12-29             IRN             SYR       010        Moscow
## 4 2017-01-02             DEU             GBR       042        London
## 5 2017-01-02             CHN             HKG       130       Beijing
## 6 2017-01-02       USAELIGOV          USAGOV       042
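Rows 1 and 6 above have an empty LocationName, so missing locations appear to be stored as empty strings rather than NA. One way to make them explicit, sketched here on a stand-in data frame (toy_events is illustrative, not the real dataset), is dplyr's na_if():

```r
library(dplyr)

# Toy rows standing in for `events`; values mirror the table above.
toy_events <- data.frame(
  EventCode = c("010", "010", "042"),
  LocationName = c("", "United States", "London"),
  stringsAsFactors = FALSE
)

# Recode empty strings as NA so missing locations are explicit.
cleaned <- toy_events %>%
  mutate(LocationName = na_if(LocationName, ""))

# Count how many events have no location.
missing_locations <- cleaned %>%
  summarise(n_missing = sum(is.na(LocationName)))
```

With NA in place, functions such as is.na() and tidyr's drop_na() treat these rows as missing without special-casing the empty string.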

Last Updated: May 08, 2017 2:26 AM