Title
RadiusSketch: Massively Distributed Indexing of Time Series
Abstract
Performing similarity queries on hundreds of millions of time series is a challenge requiring both efficient indexing techniques and parallelization. We propose a sketch/random projection-based approach that scales nearly linearly in parallel environments and provides high-quality answers. We illustrate the performance of our approach, called RadiusSketch, on real and synthetic datasets of up to 1 terabyte and 500 million time series. The sketch method, as we have implemented it, is superior in both quality and response time to the state-of-the-art approach, iSAX2+. Even in the sequential case, it improves recall and precision by a factor of two while giving shorter response times. In a parallel environment with 32 processors, on both real and synthetic data, our parallel approach improves index construction time by a factor of up to 100 and query answering time by a factor of up to 15. Finally, our data structure makes use of idle computing time to improve recall and precision still further.
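The abstract does not spell out how the sketches are built, so the following is only a minimal illustration of the general random-projection idea it refers to, not the RadiusSketch implementation. It assumes random +1/-1 vectors and Euclidean distance; all function names, sizes, and the toy data are hypothetical.

```python
import math
import random


def make_projection(length, sketch_size, seed=0):
    """Draw sketch_size random +1/-1 vectors, each as long as the series."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(length)]
            for _ in range(sketch_size)]


def sketch(series, projection):
    """The sketch is the inner product of the series with each random vector."""
    return [sum(x * r for x, r in zip(series, row)) for row in projection]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def estimated_distance(sketch_a, sketch_b):
    """Rescale the sketch distance to estimate the original distance."""
    return euclidean(sketch_a, sketch_b) / math.sqrt(len(sketch_a))


# Hypothetical toy data: two random series of length 512, sketches of size 256.
rng = random.Random(42)
length, sketch_size = 512, 256
a = [rng.gauss(0.0, 1.0) for _ in range(length)]
b = [rng.gauss(0.0, 1.0) for _ in range(length)]

projection = make_projection(length, sketch_size, seed=7)
true_d = euclidean(a, b)
approx_d = estimated_distance(sketch(a, projection), sketch(b, projection))
print(f"true={true_d:.3f} estimate={approx_d:.3f}")
```

Because each squared inner product with a +1/-1 vector has the squared norm of the series as its expectation, the rescaled sketch distance approximates the true distance, and comparisons can then be made on the much shorter sketches.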
Year
2017
DOI
10.1109/DSAA.2017.49
Venue
2017 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
Keywords
recall,precision,parallel environment,response times,indexing techniques,similarity queries,RadiusSketch,time series,sketch method,synthetic datasets,high quality answers,sketch/random projection,parallelization,distributed indexing,idle computing time,query answering time,index time construction
Field
Random projection,Data structure,Computer science,Precision and recall,Euclidean distance,Response time,Algorithm,Search engine indexing,Synthetic data,Sketch
DocType
Conference
ISSN
2472-1573
ISBN
978-1-5090-5005-5
Citations
1
PageRank
0.37
References
20
Authors
4
Name                    Order   Citations   PageRank
Djamel Edine Yagoubi    1       7           1.85
Reza Akbarinia          2       254         25.77
Florent Masseglia       3       408         43.08
Dennis E. Shasha        4       17          6.79