Title
Combining Heterogeneous Data Sources for Civil Unrest Forecasting
Abstract
Detecting and forecasting civil unrest events (protests, strikes, etc.) is of key interest to social scientists and policy makers because these events can lead to significant societal and cultural changes. We analyze protest dynamics in six countries of Latin America on a daily level, from November 2012 through August 2014, using multiple data sources that capture social, political and economic contexts within which civil unrest occurs. We use logistic regression models with Lasso to select a sparse feature set from our diverse datasets, in order to predict the probability of occurrence of civil unrest events in these countries. The models contain predictors extracted from social media sites (Twitter and blogs) and news sources, in addition to volume of requests to Tor, a widely used anonymity network. Two political event databases and country-specific exchange rates are also used. Our forecasting models are evaluated using a Gold Standard Report (GSR), which is compiled by an independent group of social scientists and experts on Latin America. The experimental results, measured by F1-scores, are in the range 0.68 to 0.95, and demonstrate the efficacy of using a multi-source approach for predicting civil unrest. Case studies illustrate the insights into unrest events that are obtained with our methods.
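The modeling step described in the abstract (logistic regression with a Lasso penalty to pick a sparse feature set, scored by F1) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain proximal gradient descent written with the Python standard library only, the data are synthetic, and the feature names (`twitter_volume`, `tor_requests`, and so on) as well as the values of `lam`, `lr`, and `n_iter` are hypothetical choices made for illustration.

```python
import math
import random


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(X, w, b):
    """Predicted probability of the positive class for each row of X."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) for x in X]


def fit_lasso_logistic(X, y, lam=0.05, lr=0.5, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept b is left unpenalized; lam is the L1 strength.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        probs = predict_proba(X, w, b)
        errs = [p - yi for p, yi in zip(probs, y)]
        grad_b = sum(errs) / n
        grad_w = [sum(e * x[j] for e, x in zip(errs, X)) / n for j in range(d)]
        b -= lr * grad_b
        # Gradient step on the log-loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(w[j] - lr * grad_w[j], lr * lam) for j in range(d)]
    return w, b


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def demo(seed=0):
    """Fit on synthetic daily data; names and coefficients are made up."""
    rng = random.Random(seed)
    names = ["twitter_volume", "news_volume", "tor_requests",
             "exchange_rate_change", "noise"]
    X, y = [], []
    for _ in range(300):
        x = [rng.gauss(0.0, 1.0) for _ in names]
        # Synthetic ground truth: only two of the five features matter.
        p = sigmoid(2.0 * x[0] - 1.5 * x[2] - 0.5)
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]
    w, b = fit_lasso_logistic(X_train, y_train)
    y_pred = [1 if p >= 0.5 else 0 for p in predict_proba(X_test, w, b)]
    for name, wj in zip(names, w):
        print(f"{name:>22s}: {wj:+.3f}")
    print("F1 on held-out days:", round(f1_score(y_test, y_pred), 3))


if __name__ == "__main__":
    demo()
```

Raising `lam` drives more coefficients to exactly zero, which is the sense in which the Lasso penalty performs feature selection. In practice a library solver with a cross-validated penalty strength would usually replace the hand-written loop.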
Year
2015
DOI
10.1145/2808797.2808847
Venue
Advances in Social Network Analysis and Mining
Keywords
heterogeneous data sources, civil unrest event forecasting, civil unrest event detection, societal change, cultural change, protest dynamics analysis, Latin America, multiple data sources, social context, political context, economic context, logistic regression models, sparse feature set selection, occurrence probability prediction, social media sites, Twitter, blogs, Tor, anonymity network, political event databases, country-specific exchange rates, Gold Standard Report, GSR, F1-scores, multisource approach
Field
Data mining, Crowds, Latin Americans, Social media, Sociology, Lasso (statistics), Emerging markets, Anonymity, Unrest, Politics
DocType
Conference
Citations
9
PageRank
0.60
References
17
Authors
6
Name, Order, Citations, PageRank
Gizem Korkmaz, 1, 98, 11.10
Jose Cadena, 2, 98, 7.53
Chris J. Kuhlman, 3, 216, 25.03
Achla Marathe, 4, 203, 23.77
Anil Kumar S. Vullikanti, 5, 1135, 98.30
Naren Ramakrishnan, 6, 1913, 176.25