Title
Analysis farm: A cloud-based scalable aggregation and query platform for network log analysis
Abstract
Network monitoring data provides insight into the operational status of a network. As techniques for probing, sampling and recording network activity grow more sophisticated, the resulting volume of monitoring data presents both an opportunity and a challenge for network data analysis. We aim to build a scalable platform, named Analysis Farm, for analyzing network logs. Analysis Farm's targets are fast log aggregation and agile log query. Achieving them requires addressing storage scalability, computation scalability and query agility. Cloud computing and NoSQL technologies meet these needs by providing manageable on-demand hardware resources and novel data storage models. We choose OpenStack, an open-source cloud tool set, for resource provisioning, and MongoDB, an RDBMS-like document-oriented NoSQL system, for log storage and analysis. By combining the scalability of OpenStack and MongoDB, we build Analysis Farm to support storage scale-out, computation scale-out and agile query. The Analysis Farm prototype in use, consisting of 10 MongoDB servers, aggregates about 3 million log records in a 10-minute interval and handles ad hoc queries effectively on a log database that accumulates more than 400 million records per day. In this paper, we describe Analysis Farm's background, targets, architecture and some experimental results. We believe Analysis Farm will benefit those who work on big-log-style data analysis.
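The two throughput figures in the abstract can be checked against each other with a little arithmetic. The sketch below is illustrative only: it assumes the 10-minute aggregation rate holds steadily over 24 hours and that load is spread evenly across the 10 servers, neither of which the abstract states. All function and variable names are hypothetical.

```python
# Figures quoted in the abstract.
RECORDS_PER_INTERVAL = 3_000_000  # "about 3 million log records"
INTERVAL_MINUTES = 10             # "in a 10-minute interval"
SERVERS = 10                      # "10 MongoDB servers"


def daily_volume(records_per_interval: int, interval_minutes: int) -> int:
    """Records accumulated in 24 hours, assuming a constant ingest rate."""
    intervals_per_day = (24 * 60) // interval_minutes
    return records_per_interval * intervals_per_day


def rate_per_second(records_per_interval: int, interval_minutes: int) -> float:
    """Average records per second within one interval."""
    return records_per_interval / (interval_minutes * 60)


records_per_day = daily_volume(RECORDS_PER_INTERVAL, INTERVAL_MINUTES)
cluster_rate = rate_per_second(RECORDS_PER_INTERVAL, INTERVAL_MINUTES)

# Assumption: an even split across servers; the abstract does not describe
# how records are actually distributed.
per_server_rate = cluster_rate / SERVERS

print(f"records per day:       {records_per_day:,}")
print(f"cluster records/sec:   {cluster_rate:,.0f}")
print(f"per-server records/sec: {per_server_rate:,.0f}")
```

Under these assumptions the daily total comes to 432 million records, which agrees with the abstract's "more than 400 million records per day".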
Year
2011
DOI
10.1109/CSC.2011.6138547
Venue
CDC
Keywords
openstack, agile query, million log record, network operation status, network log, log database, storage scalability, network log analysis, ad hoc query, data storage models, analysis farm prototype, network logs, storage management, manageable on-demand hardware resources, query platform, data loggers, log records, data analysis, nosql, network monitoring data, cloud computing, rdbms-like document-oriented nosql system, log storage, mongodb servers, cloud-based scalable aggregation, log analysis, nosql technology, big-log-style data analysis, open-source cloud tool set, analysis farm, data monitoring, agile log query, fast log aggregation, big data, computation scalability, resource provisioning, query agility, storage scale-out, computation scale-out, query processing, network data analysis, sql, hardware, indexes, engines, indexation, scalability, data storage, network monitoring, servers
DocType
Conference
ISBN
978-1-4577-1636-2
Citations
3
PageRank
0.43
References
0
Authors
5
Name            Order   Citations   PageRank
Jianwen Wei     1       3           0.77
Yusu Zhao       2       6           1.53
Kaida Jiang     3       3           0.77
Rui Xie         4       4           1.82
Yaohui Jin      5       143         29.65