Title
Hot Spot Analysis over Big Trajectory Data
Abstract
Hot spot analysis is the problem of identifying statistically significant spatial clusters from an underlying data set. In this paper, we study the problem of hot spot analysis for massive trajectory data of moving objects, which has many real-life applications in different domains, especially in the analysis of vast repositories of historical spatio-temporal traces (e.g., of cars, vessels, and aircraft). To identify hot spots, we propose an approach that relies on the Getis-Ord statistic, which has been used successfully in the past for point data. Since trajectory data is more than just a collection of individual points, we formulate the problem of trajectory hot spot analysis using the Getis-Ord statistic. We propose a parallel and scalable algorithm for this problem, called THS, which provides an exact solution and can operate on very large data sets. Moreover, we introduce an approximate algorithm (aTHS) that avoids exhaustive computation and trades off accuracy for efficiency in a controlled manner. In essence, we provide a method that quantifies the maximum induced error of the approximation in relation to the achieved computational savings. We develop our algorithms in Apache Spark and demonstrate the scalability and efficiency of our approach using a large, historical, real-life trajectory data set of vessels sailing in the Eastern Mediterranean over a period of three years.
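As a rough illustration of the Getis-Ord statistic mentioned in the abstract, the sketch below computes the standard Gi* z-score for every cell of a small grid. It is not the THS or aTHS algorithm: the 2D grid, the binary weights over a 3x3 neighbourhood, and the function name `getis_ord_gi_star` are all assumptions made for this example, and the paper's own cell definition and weighting scheme may differ.

```python
import math


def getis_ord_gi_star(grid):
    """Return a grid of Getis-Ord Gi* z-scores for a 2D grid of cell values.

    Assumption (not from the paper): binary weights, where w_ij = 1 if cell j
    lies in the 3x3 neighbourhood of cell i (including i itself), else 0.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation over all cells.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)

    scores = [[0.0] * cols for _ in range(rows)]
    if std == 0 or n < 2:
        return scores  # no variation, so no cell can stand out

    for r in range(rows):
        for c in range(cols):
            weighted_sum = 0.0  # sum_j w_ij * x_j
            weight_total = 0    # sum_j w_ij (equals sum_j w_ij^2 for binary weights)
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        weighted_sum += grid[rr][cc]
                        weight_total += 1
            spread = (n * weight_total - weight_total ** 2) / (n - 1)
            if spread > 0:
                scores[r][c] = (weighted_sum - mean * weight_total) / (
                    std * math.sqrt(spread)
                )
    return scores


if __name__ == "__main__":
    # Hypothetical counts per cell: a dense 3x3 block inside an empty 5x5 grid.
    demo = [[10 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
            for r in range(5)]
    for row in getis_ord_gi_star(demo):
        print(" ".join(f"{z:6.2f}" for z in row))
```

Cells with a large positive score are candidate hot spots; under the usual normal approximation, a score above about 1.96 corresponds to significance at the 5% level. A distributed version would need to aggregate per-cell values and exchange neighbour contributions across partitions, which this single-machine sketch does not attempt.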
Year
2018
DOI
10.1109/BigData.2018.8622376
Venue
2018 IEEE International Conference on Big Data (Big Data)
Keywords
Hot spot analysis, trajectory data, parallel processing, MapReduce, Apache Spark
Field
Hot spot (veterinary medicine), Data mining, Data set, Spark (mathematics), Statistic, Computer science, Hotspot (geology), Trajectory, Computation, Scalability
DocType
Conference
ISSN
2639-1589
ISBN
978-1-5386-5036-3
Citations
0
PageRank
0.34
References
0
Authors
5