Title
MapReduce-Based Data Stream Processing over Large History Data
Abstract
With the growth of Internet of Things applications built on sensor data, processing high-speed data streams over large-scale history data poses a new challenge. This paper proposes a new programming model, RTMR, which improves the real-time capability of traditional batch-oriented MapReduce through preprocessing and caching, along with pipelining and localizing. Furthermore, to adapt topologies to application characteristics and cluster environments, a method for constructing RTMR clusters based on model analysis is proposed. A benchmark built on an urban vehicle monitoring system shows that RTMR provides real-time capability and scalability for data stream processing over large-scale data.
Year
2012
DOI
10.1007/978-3-642-34321-6_57
Venue
ICSOC
Keywords
model analysis, mapreduce-based data stream processing, large scale data, large history data, real-time capability, data stream processing, new challenge, cluster environment, sensor data, large scale history data, high speed data stream, rtmr cluster
Field
Data mining, Pipeline (computing), Data stream mining, Programming paradigm, Computer science, Data stream, Network topology, Preprocessor, Batch processing, Scalability
DocType
Conference
Citations
4
PageRank
0.43
References
9
Authors
4
Name            Order  Citations  PageRank
Kaiyuan Qi      1      13         2.47
Zhuofeng Zhao   2      66         15.46
Jun Fang        3      19         3.26
Yanbo Han       4      500        59.74