A gold-standard social media corpus for urban issues. - Citegraph

Paper Info

Title
A gold-standard social media corpus for urban issues.

Abstract
This paper introduces a gold-standard corpus extracted from manually labeled tweets concerning urban issues. The main contribution is a labeled tweet dataset that can be useful for building machine-learning classifiers in the urban issues domain, including geographical features. This corpus can therefore also be useful for improving geoparsers to correctly identify place names in urban areas, such as Points-of-Interest (POI), Streets/Roads and Districts. Our method for building the corpus includes human-volunteer quality assessment and human-driven labeling using an ad hoc web application, the Tweet Annotator. The volunteers were asked to complete a feedback survey in order to identify the main difficulties during the labeling task. In this paper, we also report the findings from a case study carried out to analyze the spatial relationships in the generated corpus among the locations to which a tweet may refer: the geocoded location, the user's home location and the mentioned locations.

Year	2017
DOI	10.1145/3019612.3019808
Venue	SAC
Field	Toponymy, World Wide Web, Social media, Geocoding, Computer science, Geoparsing, Web application, Gold standard
DocType	Conference
Citations	2
PageRank	0.36
References	13
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Maxwell Guimarães de Oliveira	1	24	7.15
Cláudio de Souza Baptista	2	133	31.73
Cláudio Elízio Calazans Campelo	3	22	5.47
Michela Bertolotto	4	863	91.77
