Title
TimeStream: reliable stream computation in the cloud
Abstract
TimeStream is a distributed system designed specifically for low-latency continuous processing of big streaming data on a large cluster of commodity machines. The unique characteristics of this emerging application domain have led to a significantly different design from the popular MapReduce-style batch data processing. In particular, we advocate a powerful new abstraction called resilient substitution that caters to the specific needs in this new computation model to handle failure recovery and dynamic reconfiguration in response to load changes. Several real-world applications running on our prototype have been shown to scale robustly with low latency while at the same time maintaining the simple and concise declarative programming model. TimeStream handles an on-line advertising aggregation pipeline at a rate of 700,000 URLs per second with a 2-second delay, while performing sentiment analysis of Twitter data at a peak rate close to 10,000 tweets per second, with approximately 2-second delay.
Year
2013
DOI
10.1145/2465351.2465353
Venue
EuroSys
Keywords
commodity machine, application domain, popular mapreduce-style batch data, peak rate close, reliable stream computation, new computation model, powerful new abstraction, twitter data, concise declarative programming model, low-latency continuous processing, 2-second delay, fault tolerance, real time, cluster computing
Field
Computer science, Sentiment analysis, Real-time computing, Fault tolerance, Application domain, Declarative programming, Latency (engineering), Control reconfiguration, Computer cluster, Distributed computing, Cloud computing
DocType
Conference
Citations
111
PageRank
3.14
References
24
Authors
9
Name            Order  Citations  PageRank
Zhengping Qian  1      350        17.04
Yong He         2      111        3.14
Chunzhi Su      3      119        3.86
Zhuojie Wu      4      111        3.14
Hongyu Zhu      5      111        3.14
Taizhi Zhang    6      111        3.14
Lidong Zhou     7      2136       147.82
Yuan Yu         8      2955       149.84
Zheng Zhang     9      1193       73.82