Title
A Comparative Analysis of Content-based Geolocation in Blogs and Tweets.
Abstract
The geolocation of online information is an essential component in any geospatial application. While most previous work on geolocation has focused on Twitter, in this paper we quantify and compare the performance of text-based geolocation methods on social media data drawn from both Blogger and Twitter. We introduce a novel set of location-specific features that are both highly informative and easily interpretable, and show that we can achieve error rate reductions of up to 12.5% with respect to the best previously proposed geolocation features. We also show that, despite posting longer text, Blogger users are significantly harder to geolocate than Twitter users. Additionally, we investigate the effect of training and testing on different media (cross-media predictions) and of combining multiple social media sources (multi-media predictions). Finally, we explore the geolocability of social media in relation to three user dimensions: state, gender, and industry.
Year: 2018
Venue: arXiv: Computation and Language
DocType: Journal
Volume: abs/1811.07497
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name                  Order  Citations  PageRank
Konstantinos Pappas   1      0          1.01
Mahmoud Azab          2      12         4.09
Rada Mihalcea         3      6460       445.54