Title
A near-linear time approximation algorithm for angle-based outlier detection in high-dimensional data
Abstract
Outlier mining in d-dimensional point sets is a fundamental and well studied data mining task due to its variety of applications. Most such applications arise in high-dimensional domains. A bottleneck of existing approaches is that implicit or explicit assessments on concepts of distance or nearest neighbor are deteriorated in high-dimensional data. Following up on the work of Kriegel et al. (KDD '08), we investigate the use of angle-based outlier factor in mining high-dimensional outliers. While their algorithm runs in cubic time (with a quadratic time heuristic), we propose a novel random projection-based technique that is able to estimate the angle-based outlier factor for all data points in time near-linear in the size of the data. Also, our approach is suitable to be performed in parallel environment to achieve a parallel speedup. We introduce a theoretical analysis of the quality of approximation to guarantee the reliability of our estimation algorithm. The empirical experiments on synthetic and real world data sets demonstrate that our approach is efficient and scalable to very large high-dimensional data sets.
Year
DOI
Venue
2012
10.1145/2339530.2339669
KDD
Keywords
Field
DocType
angle-based outlier factor,large high-dimensional data set,real world data set,high-dimensional data,mining high-dimensional outlier,high-dimensional domain,data mining task,angle-based outlier detection,data point,outlier mining,near-linear time approximation algorithm,cubic time,data mining,nearest neighbor,high dimensional data,outlier detection,linear time
k-nearest neighbors algorithm,Data point,Random projection,Anomaly detection,Approximation algorithm,Data mining,Clustering high-dimensional data,Computer science,Outlier,Artificial intelligence,Time complexity,Machine learning
Conference
Citations 
PageRank 
References 
37
1.04
15
Authors
2
Name
Order
Citations
PageRank
Ninh Pham11697.68
Rasmus Pagh2134486.08