Title
Second-Order Destination Inference using Semi-Supervised Self-Training for Entry-Only Passenger Data.
Abstract
Automated data collection in urban transportation systems produces a large volume of passenger data. However, much of this data remains incomplete, which limits insight into passenger mobility. A common issue is that entry-only passenger data lacks destination information. Traditional approaches to estimating passenger destinations rely on heuristics that can recover only some of the missing destinations. To deal with the remaining incomplete data, this paper proposes, for the first time, a second-order inference methodology that leverages semi-supervised self-training to infer the missing destinations. The methodology involves the design of a base learner, which predicts the missing destinations from the statistics of a selected similarity-based set, and the design of a selection strategy, which selects new data with high prediction confidence to update the training set. To further improve the inference, we incorporate personal history priors to modify the base learner. We evaluate our designs using two data sources: a traffic and passenger behavior simulation of the city of Porto, Portugal, inspired by real data, and real bus Automated Fare Collection (AFC) data collected from the same city. The experimental results show that, compared to baseline methods that do not use self-training, our approach significantly improves inference performance and achieves high accuracy.
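The self-training loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the confidence score, or the thresholds, and the personal history priors are omitted. The names `overlap`, `predict`, `self_train`, `min_overlap`, `min_support`, and `threshold`, along with the toy feature tuples, are all assumptions made for illustration. The base learner here is a majority vote over trips that share enough features, and the selection strategy accepts predictions whose vote share reaches a threshold.

```python
from collections import Counter


def overlap(a, b):
    """Number of positions where two feature tuples agree (assumed similarity)."""
    return sum(x == y for x, y in zip(a, b))


def predict(trip, labeled, min_overlap=2, min_support=1):
    """Hypothetical base learner: majority vote over the similar set of `trip`.

    Returns (destination, confidence), or (None, 0.0) when fewer than
    `min_support` labeled trips are similar enough.
    """
    votes = Counter(dest for feats, dest in labeled
                    if overlap(feats, trip) >= min_overlap)
    support = sum(votes.values())
    if support < min_support:
        return None, 0.0
    dest, count = votes.most_common(1)[0]
    return dest, count / support


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=10):
    """Add confident predictions to the training set until none are accepted."""
    training = list(labeled)
    pending = dict(enumerate(unlabeled))  # index -> feature tuple
    inferred = {}                         # index -> inferred destination
    for _ in range(max_rounds):
        # Score every pending trip against the same snapshot of the training set.
        accepted = {}
        for idx, trip in pending.items():
            dest, conf = predict(trip, training)
            if dest is not None and conf >= threshold:
                accepted[idx] = dest
        if not accepted:
            break
        # Selection step: move confident trips into the training set.
        for idx, dest in accepted.items():
            training.append((pending.pop(idx), dest))
            inferred[idx] = dest
    return inferred


# Toy data (invented): features are (entry stop, time of day, day type).
labeled = [(("S1", "am", "weekday"), "D1"),
           (("S1", "am", "weekday"), "D1")]
unlabeled = [("S1", "am", "weekend"),
             ("S1", "pm", "weekend"),
             ("S9", "pm", "holiday")]
result = self_train(labeled, unlabeled)
```

In this toy run the second unlabeled trip has no sufficiently similar labeled trip at first. It becomes predictable only after the first trip is pseudo-labeled and added to the training set, which is the effect self-training is meant to provide. The third trip never finds a similar set and stays without a destination.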
Year
2017
Venue
BDCAT
Field
Data mining, Data collection, Semi-supervised learning, Leverage (finance), Inference, Computer science, Unavailability, Heuristics, Artificial intelligence, Prior probability, Machine learning, Destinations
DocType
Conference
Citations
0
PageRank
0.34
References
9
Authors
3
Name, Order, Citations, PageRank
Rongye Shi, 1, 6, 1.49
Peter Steenkiste, 2, 5104, 518.46
Manuela Veloso, 3, 8563, 882.50