Title
Are call detail records biased for sampling human mobility?
Abstract
Call detail records (CDRs) have recently been used to study different aspects of human mobility. While CDRs provide a means of sampling user locations at large population scales, they may not sample all locations in proportion to a user's visitation frequency, owing to the sparsity of voice calls in time and space, thereby introducing a bias. Also, as the sampling rate is inherently dependent on an individual's calling frequency, users with high voice-call activity are often chosen for conducting a meaningful study. Such a selection process can inadvertently lead to a biased view, as high-frequency callers may not always be representative of an entire population. With the advent of 3G technology and the wide adoption of smartphones, cellular devices have become versatile end-hosts. As the data accessed on these devices does not always require human initiation, it affords us an unprecedented opportunity to validate the utility of CDRs for studying human mobility. In this work, we investigate various metrics for human mobility studied in the literature for over a million cellular users in the San Francisco Bay Area, over a period of more than a month. Our findings reveal that although the voice-call process does well at sampling significant locations, such as home and work, it may in some cases incur biases in capturing the overall spatio-temporal characteristics of individual human mobility. Additionally, we motivate an "artificially" imposed sampling process, vis-a-vis the voice-call process, with the same average intensity. We observe that in many cases such an imposed sampling process yields better results on the usual metrics, such as entropies and marginal distributions, often used in the literature.
Year
2012
DOI
10.1145/2412096.2412101
Venue
Mobile Computing and Communications Review
Keywords
selection process, sampling process yield, cellular device, sampling process, individual human mobility, high voice-call activity user, human initiation, voice-call process, sampling user location, call detail record, human mobility
Field
Population, Sampling process, Data mining, Name resolution, Computer science, Sampling (statistics), Marginal distribution
DocType
Journal
Volume
16
Issue
3
ISSN
1559-1662
Citations
35
PageRank
1.52
References
10
Authors
4
Name | Order | Citations, PageRank
Gyan Ranjan | 1 | 727.04
Hui Zang | 2 | 105277.25
Zhi-Li Zhang | 3 | 4063317.10
Jean Bolot | 4 | 20310.69