Title
Construction and first analysis of a corpus for the evaluation and training of microblog/twitter geoparsers
Abstract
This article presents an approach to place reference corpus building and application of the approach to a Geo-Microblog Corpus that will foster research and development in the areas of microblog/twitter geoparsing and geographic information retrieval. Our corpus currently consists of 6000 tweets with identified and georeferenced place names. 30% of the tweets contain at least one place name. The corpus is intended to support the evaluation, comparison, and training of geoparsers. We introduce our corpus building framework, which is developed to be generally applicable beyond microblogs, and explain how we use crowdsourcing and geovisual analytics technology to support the construction of relatively large corpora. We then report on the corpus building work and present an analysis of causes of disagreement between the lay persons performing place identification in our crowdsourcing approach.
Year
2014
DOI
10.1145/2675354.2675701
Venue
GIR
Keywords
experimentation, human factors, microblogs, geoparsing, languages, corpus building, twitter, information search and retrieval
Field
Toponymy, World Wide Web, Social media, Information retrieval, Computer science, Crowdsourcing, Microblogging, Georeference, Geographic information retrieval, Geoparsing, Analytics
DocType
Conference
Citations
7
PageRank
1.30
References
10
Authors
6
Name                  Order  Citations  PageRank
Jan Oliver Wallgrün   1      233        19.29
Frank Hardisty        2      159        17.74
Alan M. MacEachren    3      1207       104.22
Morteza Karimzadeh    4      30         5.10
Yiting Ju             5      7          1.30
Scott Pezanowski      6      171        11.84