Title
Text-based Twitter user geolocation prediction
Abstract
Geographical location is vital to geospatial applications like local search and event detection. In this paper, we investigate and improve on the task of text-based geolocation prediction of Twitter users. Previous studies on this topic have typically assumed that geographical references (e.g., gazetteer terms, dialectal words) in a text are indicative of its author's location. However, these references are often buried in informal, ungrammatical, and multilingual data, and are therefore non-trivial to identify and exploit. We present an integrated geolocation prediction framework and investigate what factors impact on prediction accuracy. First, we evaluate a range of feature selection methods to obtain "location indicative words". We then evaluate the impact of non-geotagged tweets, language, and user-declared metadata on geolocation prediction. In addition, we evaluate the impact of temporal variance on model generalisation, and discuss how users differ in terms of their geolocatability. We achieve state-of-the-art results for the text-based Twitter user geolocation task, and also provide the most extensive exploration of the task to date. Our findings provide valuable insights into the design of robust, practical text-based geolocation prediction systems.
Year
2014
DOI
10.1613/jair.4200
Venue
J. Artif. Intell. Res. (JAIR)
Field
Geospatial analysis, Metadata, Data mining, Location, Feature selection, Information retrieval, Generalization, Geolocation, Exploit, Local search (optimization), Mathematics
DocType
Journal
Volume
49
Issue
1
ISSN
1076-9757
Citations
106
PageRank
2.91
References
63
Authors
3
Name, Order, Citations, PageRank
Bo Han, 1, 593, 29.85
Paul Cook, 2, 117, 3.50
Timothy Baldwin, 3, 452, 22.18