Title
HERMEVENT: a news collection for emerging-event detection
Abstract
News portals and microblogging platforms have become people's medium of choice for breaking news and unexpected events, because they provide directions and useful information in a more timely and effective manner than official communication channels. This has spurred a large body of research on event detection in social-media streams. However, this research is severely hampered by the scarcity of publicly available test collections, which are needed to build proper evaluation mechanisms. In this paper we introduce a new test collection for event detection, which we dub HERMEVENT. The dataset includes a large-scale dump of tweets and news articles from a list of major Italian newspapers, spanning a time interval of approximately 3 months in 2016/2017. From this dump we extracted a set of temporal graphs with different semantic and temporal granularity. To demonstrate the quality of the collection, we run two state-of-the-art algorithms that detect emerging events by extracting dense sub-graphs from a temporal graph. We conduct an editorial evaluation of the events discovered by the two algorithms on a set of 780 stories, achieving an accuracy of 75% in detecting real-world events. We make the text dump, the graphs and the editorial judgements freely available. We believe that this new dataset can be a useful contribution to research on event detection.
Year
2017
DOI
10.1145/3102254.3102262
Venue
WIMS
Field
Entity linking, Data collection, Data mining, Social media, Scarcity, Web mining, Information retrieval, Computer science, Microblogging, Communication channel, Newspaper, Unexpected events
DocType
Conference
ISBN
978-1-4503-5225-3
Citations
0
PageRank
0.34
References
32
Authors
6
Name                    Order  Citations  PageRank
Cristiano Di Crescenzo  1      0          0.34
Giulia Gavazzi          2      0          0.34
Giacomo Legnaro         3      0          0.34
Elena Troccoli          4      0          0.34
Ilaria Bordino          5      629        28.81
Francesco Gullo         6      483        32.63