Abstract |
---|
Among many Big Data applications are those that deal with data streams. A data stream is a sequence of data points with timestamps that possesses the properties of transiency, infiniteness, uncertainty, concept drift, and multi-dimensionality. In this paper we propose an outlier detection technique called Orion that addresses all the characteristics of data streams. Orion looks for a projected dimension of multi-dimensional data points with the help of an evolutionary algorithm, and identifies a data point as an outlier if it resides in a low-density region in that dimension. Experiments comparing Orion with existing techniques using both real and synthetic datasets show that Orion achieves an average of 7X the precision, 5X the recall, and a competitive execution time compared to existing techniques. |
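The abstract describes two steps: searching for a projected dimension with an evolutionary algorithm, then flagging points that fall in a low-density region of that dimension. The following is only a minimal sketch of that general idea and is not Orion's actual algorithm. The function names, the kernel density estimate, the fitness function, the mutation scheme, and all parameter values (`bandwidth`, `sigma`, `ratio`) are illustrative assumptions, and the sketch works on a fixed window of points rather than a live stream.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def project(window, direction):
    """Project every point onto `direction`, then standardize the 1-D values."""
    values = [sum(p * d for p, d in zip(point, direction)) for point in window]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
    return [(v - mean) / std for v in values]


def densities(values, bandwidth=0.3):
    """Gaussian kernel density estimate at each projected value (assumed estimator)."""
    scale = len(values) * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((v - u) / bandwidth) ** 2) for u in values) / scale
        for v in values
    ]


def fitness(direction, window):
    """Hypothetical fitness: higher when some point sits in a sparser region."""
    return -min(densities(project(window, direction)))


def evolve_direction(window, generations=20, offspring=10, sigma=0.3, seed=0):
    """Simple elitist evolutionary search over unit direction vectors."""
    rng = random.Random(seed)
    dims = len(window[0])
    # Start from the coordinate axes so the result is never worse than any axis.
    axes = [[1.0 if i == j else 0.0 for j in range(dims)] for i in range(dims)]
    best = max(axes, key=lambda d: fitness(d, window))
    best_fit = fitness(best, window)
    for _ in range(generations):
        for _ in range(offspring):
            # Mutate the current best direction with Gaussian noise.
            child = normalize([x + rng.gauss(0, sigma) for x in best])
            child_fit = fitness(child, window)
            if child_fit > best_fit:  # keep the child only if it improves
                best, best_fit = child, child_fit
    return best


def flag_outliers(window, direction, ratio=0.2):
    """Return indices whose projected density is below `ratio` times the median."""
    dens = densities(project(window, direction))
    median = sorted(dens)[len(dens) // 2]
    return [i for i, d in enumerate(dens) if d < ratio * median]
```

A caller would typically run `evolve_direction(window)` on the current window and pass the result to `flag_outliers(window, direction)`. Handling concept drift, uncertainty, and unbounded streams, which the paper claims to address, is outside the scope of this sketch.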
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/BigData.2016.7840642 | 2016 IEEE International Conference on Big Data (Big Data) |
Keywords | Field | DocType |
---|---|---|
Data Streams, Outlier Detection, Data Mining | Data point, Data mining, Anomaly detection, Data stream mining, Evolutionary algorithm, Computer science, Data stream, Outlier, Concept drift, Artificial intelligence, Big data, Machine learning | Conference |
ISBN | Citations | PageRank |
---|---|---|
978-1-4673-9006-4 | 0 | 0.34 |
References | Authors |
---|---|
16 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shiblee Sadik | 1 | 13 | 1.58 |
Le Gruenwald | 2 | 1241 | 131.12 |
Eleazar Leal | 3 | 7 | 6.25 |