Title
A gold-standard social media corpus for urban issues.
Abstract
This paper introduces a gold-standard corpus extracted from manually labeled tweets concerning urban issues. The main contribution is a labeled tweet dataset that can be used to build machine-learning classifiers in the urban issues domain, including geographical features. The corpus can therefore also help improve geoparsers so that they correctly identify urban place names such as Points-of-Interest (POI), Streets/Roads and Districts. Our method for building the corpus includes human-volunteer quality assessment and human-driven labeling using an ad hoc web application, the Tweet Annotator. The volunteers were asked to complete a feedback survey to identify the main difficulties they encountered during the labeling task. We also report the findings of a case study that analyzes the spatial relationships in the generated corpus among the locations a tweet may refer to: the geocoded location, the user's home location, and the mentioned locations.
Year
2017
DOI
10.1145/3019612.3019808
Venue
SAC
Field
Toponymy, World Wide Web, Social media, Geocoding, Computer science, Geoparsing, Web application, Gold standard
DocType
Conference
Citations
2
PageRank
0.36
References
13
Authors
4