Title
Chukwa: a system for reliable large-scale log collection
Abstract
Large Internet services companies like Google, Yahoo, and Facebook use the MapReduce programming model to process log data. MapReduce is designed to work on data stored in a distributed filesystem like Hadoop's HDFS. As a result, a number of log collection systems have been built to copy data into HDFS. These systems often lack a unified approach to failure handling, with errors handled separately by each piece of the collection, transport, and processing pipeline. We argue instead for a unified approach. We present a system, called Chukwa, that embodies this approach. Chukwa uses an end-to-end delivery model that can leverage local on-disk log files for reliability; this approach also eases integration with legacy systems. The architecture offers a choice of delivery models, making subsets of the collected data available promptly to clients that require them, while reliably storing a copy in HDFS. We demonstrate that our system works correctly on a 200-node testbed and can collect in excess of 200 MB/sec of log data. We supplement these measurements with a set of case studies describing real-world operational experience at several sites.
Year: 2010
Venue: LISA
Keywords: unified approach, end-to-end delivery model, log collection system, log data, mapreduce programming model, reliable large-scale log collection, delivery model, local on-disk log file, large internet services company, legacy system, case study, scale, logging
Field: Architecture, Programming paradigm, Computer science, Testbed, Database, Legacy system, The Internet
DocType: Conference
Citations: 14
PageRank: 0.93
References: 22
Authors: 2
Name          | Order | Citations | PageRank
Ariel Rabkin  | 1     | 17047     | 3.10
Randy H. Katz | 2     | 1681930   | 18.89