Abstract | ||
---|---|---|
This paper introduces a gold-standard corpus extracted from manually labeled tweets concerning urban issues. The main contribution is a labeled tweet dataset that can be used to build machine-learning classifiers in the urban issues domain, including geographical features. The corpus can therefore also help improve geoparsers so that they correctly identify urban place names such as Points-of-Interest (POI), Streets/Roads and Districts. Our method for building the corpus includes human-volunteer quality assessment and human-driven labeling using an ad hoc web application, the Tweet Annotator. The volunteers were asked to complete a feedback survey in order to identify the main difficulties encountered during the labeling task. We also report the findings of a case study carried out to analyze the spatial relationships in the generated corpus among the locations to which a tweet may refer: the geocoded location, the user's home location and the mentioned locations. |
Year | DOI | Venue |
---|---|---|
2017 | 10.1145/3019612.3019808 | SAC |
Field | DocType | Citations |
Toponymy,World Wide Web,Social media,Geocoding,Computer science,Geoparsing,Web application,Gold standard | Conference | 2 |
PageRank | References | Authors |
0.36 | 13 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Maxwell Guimarães de Oliveira | 1 | 24 | 7.15 |
Cláudio de Souza Baptista | 2 | 133 | 31.73 |
Cláudio Elízio Calazans Campelo | 3 | 22 | 5.47 |
Michela Bertolotto | 4 | 863 | 91.77 |