Title
In pursuit of outliers in multi-dimensional data streams
Abstract
Many Big Data applications deal with data streams. A data stream is a sequence of timestamped data points characterized by transiency, infiniteness, uncertainty, concept drift, and multi-dimensionality. In this paper we propose an outlier detection technique called Orion that addresses all of these characteristics. Orion uses an evolutionary algorithm to search for a projected dimension of the multi-dimensional data points, and identifies a data point as an outlier if it resides in a low-density region in that dimension. Experiments on both real and synthetic datasets show that, on average, Orion achieves 7x the precision and 5x the recall of existing techniques, with a competitive execution time.
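The idea in the abstract (search for a projection with an evolutionary algorithm, then flag points that fall in a low-density region of that projection) can be illustrated with a minimal sketch. This is not the Orion algorithm from the paper: the Gaussian kernel density estimate, the truncation-selection loop, the static window `X` standing in for a stream, and every name and parameter below (`projected_density`, `find_low_density_direction`, `is_outlier`, `bandwidth`, `ratio`, `sigma`) are assumptions made purely for illustration.

```python
import numpy as np


def projected_density(X, w, x, bandwidth=0.5):
    """Gaussian kernel density of x among the rows of X after projecting onto w."""
    w = w / np.linalg.norm(w)            # work with a unit direction
    z = (X @ w - x @ w) / bandwidth      # scaled 1-D distances to x
    return np.mean(np.exp(-0.5 * z ** 2)) / (bandwidth * np.sqrt(2.0 * np.pi))


def find_low_density_direction(X, x, generations=30, pop_size=20, sigma=0.3, seed=0):
    """Evolve unit vectors toward a direction in which x has low density.

    Hypothetical search loop (assumption, not the paper's operator set):
    keep the better half of the population, add Gaussian mutations.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pop = rng.normal(size=(pop_size, d))
    pop /= np.linalg.norm(pop, axis=1, keepdims=True)
    for _ in range(generations):
        fitness = np.array([projected_density(X, w, x) for w in pop])
        parents = pop[np.argsort(fitness)[: pop_size // 2]]   # lowest density first
        children = parents + rng.normal(scale=sigma, size=parents.shape)
        children /= np.linalg.norm(children, axis=1, keepdims=True)
        pop = np.vstack([parents, children])
    fitness = np.array([projected_density(X, w, x) for w in pop])
    best = int(np.argmin(fitness))
    return pop[best], float(fitness[best])


def is_outlier(X, x, ratio=0.1, **search_kwargs):
    """Flag x if its density in the evolved direction is far below the typical density."""
    w, density = find_low_density_direction(X, x, **search_kwargs)
    typical = np.median([projected_density(X, w, p) for p in X])
    return bool(density < ratio * typical)
```

Calling `is_outlier(window, point)` on a recent window of stream points gives a yes/no answer for one incoming point. A streaming version would also need to update the window and density estimates incrementally, which this sketch does not attempt.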
Year
2016
DOI
10.1109/BigData.2016.7840642
Venue
2016 IEEE International Conference on Big Data (Big Data)
Keywords
Data Streams, Outlier Detection, Data Mining
Field
Data point, Data mining, Anomaly detection, Data stream mining, Evolutionary algorithm, Computer science, Data stream, Outlier, Concept drift, Artificial intelligence, Big data, Machine learning
DocType
Conference
ISBN
978-1-4673-9006-4
Citations
0
PageRank
0.34
References
16
Authors
3
Name             Order  Citations  PageRank
Shiblee Sadik    1      13         1.58
Le Gruenwald     2      1241       131.12
Eleazar Leal     3      7          6.25