Title
Finding Non-Redundant Multi-Word Events on Twitter
Abstract
Twitter is a pervasive technology, with hundreds of millions of users serving as sensors that provide eyewitness accounts of events on the ground. In the case of popular events, these sensors start to broadcast news by tweeting to their followers and to the world. Within minutes these tweets can attract attention and serve as a primary information source for traditional media. Given a huge set of tweets, the key questions are: (1) How can we detect informative events in general? (2) How can we distinguish relevant events from others? In this paper we tackle these challenges with a statistical model that detects events by spotting significant deviations in word frequencies over time. Besides single-word events, our model also accounts for events composed of multiple co-occurring words, thus providing much richer information. Our statistical process is complemented with an optimization algorithm that extracts only non-redundant events, providing the user with a succinct summary of current events. We used our model to analyze 24 million geotagged tweets sent in the US from April 9 to April 22, 2013, the period of the Boston Marathon bombing, and we show that our approach can create multi-word events that efficiently summarize real-world events.
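As a rough illustration of what "spotting significant deviations in word frequencies over time" can look like, the sketch below flags words whose share of tweets in the current time window is far above their historical average, using a plain z-score. This is not the authors' statistical model, and it covers neither multi-word events nor the redundancy-removing optimization; the function name `burst_words` and the parameters `z_threshold`, `min_count`, and `sigma_floor` are hypothetical choices made for this example.

```python
from collections import Counter
from statistics import mean, pstdev


def burst_words(history, current, z_threshold=3.0, min_count=3, sigma_floor=0.05):
    """Return {word: z_score} for words that are unusually frequent in `current`.

    `history` is a list of past time windows and `current` is the newest window;
    each window is a list of tweets, and each tweet is a list of tokens.
    Illustrative z-score heuristic only, not the model from the paper.
    """
    if not history:
        raise ValueError("need at least one historical window")

    def doc_counts(tweets):
        # Count each word at most once per tweet (document frequency).
        return Counter(word for tweet in tweets for word in set(tweet))

    def rel_freq(tweets):
        n = max(len(tweets), 1)
        return {word: c / n for word, c in doc_counts(tweets).items()}

    past_freqs = [rel_freq(window) for window in history]
    cur_counts = doc_counts(current)
    n_current = max(len(current), 1)

    bursts = {}
    for word, count in cur_counts.items():
        if count < min_count:
            continue  # ignore words that are too rare to be meaningful
        past = [freqs.get(word, 0.0) for freqs in past_freqs]
        mu = mean(past)
        # Floor the standard deviation so unseen words do not divide by zero.
        sigma = max(pstdev(past), sigma_floor)
        z = (count / n_current - mu) / sigma
        if z >= z_threshold:
            bursts[word] = z
    return bursts


# Toy data: three quiet windows followed by one window with a sudden topic.
quiet = [["the", "marathon"], ["the", "game"], ["the", "run"], ["coffee", "time"]]
history = [quiet, quiet, quiet]
current = [
    ["explosion", "boston", "the"],
    ["explosion", "boston"],
    ["the", "explosion", "boston"],
    ["the", "coffee"],
]
result = burst_words(history, current)
```

On the toy data, "explosion" and "boston" are flagged because they never appeared in the earlier windows, while "the" is not, since its frequency matches its history. A realistic system would need far larger windows and a principled significance test in place of the fixed threshold.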
Year
2015
DOI
10.1145/2808797.2809390
Venue
ASONAM
Keywords
nonredundant multiword events, Twitter, informative events, statistical model, optimization algorithm, geotagged tweets, Boston marathon bombing
Field
Data mining, Broadcasting, Algorithm design, Pervasive technology, Computer science, Optimization algorithm, Statistical model, Statistical process control, Spotting
DocType
Conference
Citations
2
PageRank
0.36
References
12
Authors
2
Name
Order
Citations
PageRank
Nikou Günnemann1524.51
Jürgen Pfeffer234626.57