Title
MTopS: scalable processing of continuous top-k multi-query workloads
Abstract
A continuous top-k query retrieves the k most preferred objects in a data stream according to a given preference function. These queries are important for a broad spectrum of applications ranging from web-based advertising to financial analysis. In various streaming applications, a large number of such continuous top-k queries need to be executed simultaneously against a common popular input stream. To efficiently handle such top-k query workload, we present a comprehensive framework, called MTopS.Within this MTopS framework, several computational components work collaboratively to first analyze the commonalities across the workload; organize the workload for maximized sharing opportunities; execute the workload queries simultaneously in a shared manner; and output query results whenever any input query requires. In particular, MTopS supports two proposed algorithms, MTopBand and MTopList, which both incrementally maintain the top-k objects over time for multiple queries. As the foundation, we first identify the minimal object set from the data stream that is both necessary and sufficient for accurately answering all top-k queries in the workload. Then, the MTopBand algorithm is presented to incrementally maintain such minimum object set and eliminate the need for any recomputation from scratch. To further optimize MTop-Band, we design the second algorithm, MTopList which organizes the progressive top-k results of workload queries in a compact structure. MTopList is shown to be memory optimal and also more efficient in terms of CPU time usage than MTopBand. Our experimental study, using real data streams from domains of stock trades and moving object monitoring, demonstrates that both the efficiency and scalability of our proposed techniques are clearly superior to the state-of-the-art solutions.
Year
DOI
Venue
2011
10.1145/2063576.2063736
CIKM
Keywords
Field
DocType
top-k query,multiple query,scalable processing,top-k query workload,top-k object,input query,continuous top-k query,continuous top-k multi-query workloads,workload query,output query result,progressive top-k result,data stream,spectrum,query optimization,streams,financial analysis
Query optimization,Data mining,Data stream mining,Information retrieval,Computer science,Data stream,CPU time,Workload,Ranging,Scalability
Conference
Citations 
PageRank 
References 
1
0.35
16
Authors
4
Name
Order
Citations
PageRank
Avani Shastri1170.96
Di Yang21529.72
Elke A. Rundensteiner34076700.65
Matthew O. Ward41757189.48