Title
Assessing spatiotemporal predictability of LBSN: a case study of three Foursquare datasets
Abstract
Location-based social networks (LBSN) have provided new possibilities for researchers to gain knowledge about human spatiotemporal behavior, and to make predictions about how people might behave through space and time in the future. An important requirement of successfully utilizing LBSN in these regards is a thorough understanding of the respective datasets, including their inherent potential as well as their limitations. Specifically, when it comes to predictions, we must know what we can actually expect from the data, and how we could maximize their usefulness. Yet, this knowledge is still largely lacking from the literature. Hence, this work explores one particular aspect: the theoretical predictability of LBSN datasets. The uncovered predictability is represented as an interval. The lower bound of the interval corresponds to the amount of regular behavior that can easily be anticipated, and represents the correct prediction rate that any algorithm should be able to achieve. The upper bound corresponds to the amount of information that is contained in the dataset, and represents the maximum correct prediction rate that cannot be exceeded by any algorithm. Three Foursquare datasets from three American cities are studied as an example. It is found that, within our investigated datasets, the lower bound of predictability of human spatiotemporal behavior is 27%, and the upper bound is 92%. Hence, the inherent potential of the datasets for predicting human spatiotemporal behavior is clarified, and the revealed interval allows a realistic assessment of the quality of predictions and thus of the associated algorithms. Additionally, in order to provide further insight into the practical use of the datasets, the relationship between the predictability and the check-in frequency is investigated from three different perspectives.
It was found that the individual perspective yields no significant correlation between the predictability and the check-in frequency. In contrast, the same two quantities are found to be negatively correlated from the temporal and spatial perspectives. Our study further indicates that heavily frequented contexts and some extraordinary geographic features, such as airports, could be good starting points for effective improvements of prediction algorithms. In general, this research provides novel knowledge regarding the nature of LBSN datasets and practical insights for a more reasonable utilization of such datasets.
Year
2018
DOI
https://doi.org/10.1007/s10707-016-0279-5
Venue
GeoInformatica
Keywords
Predictability, Spatiotemporal behavior, Context, Location-based social networks, Foursquare, Citizen sensing
Field
Data mining, Predictability, Social network, Upper and lower bounds, Prediction algorithms, Geography
DocType
Journal
Volume
22
Issue
3
ISSN
1384-6175
Citations
1
PageRank
0.34
References
28
Authors
4
Name, Order, Citations, PageRank
Ming Li, 1, 12, 1.89
René Westerholt, 2, 25, 3.65
Hongchao Fan, 3, 17, 7.44
Alexander Zipf, 4, 229, 23.84