Title
Alleviation of Disk I/O Contention in Virtualized Settings for Data-Intensive Computing
Abstract
Steady growth in storage and processing capabilities has led to the accumulation of large-scale datasets that contain valuable insight into the interactions of complex systems, long- and short-term trends, and real-world phenomena. Converged infrastructure, operating on cloud deployments and private clusters, has emerged as an energy-efficient and cost-effective means of coping with these computing demands. However, increased collocation of storage and processing activities often leads to greater contention for resources in high-use situations. This issue is particularly pronounced when running distributed computations (such as MapReduce applications), because overall execution times are dependent on the completion time of the slowest task(s). In this study, we propose a framework that makes opinionated disk scheduling decisions to ensure high throughput for tasks that use I/O resources conservatively, while still maintaining the average performance of long-running batch processing operations. Our solution does not require modification of client applications or virtual machines, and we illustrate its efficacy on a cluster of 1,200 VMs with a variety of datasets that span over 1 Petabyte of information. In situations with high disk interference, our algorithm resulted in a 20% improvement in MapReduce completion times.
Year: 2015
DOI: 10.1109/BDC.2015.32
Venue: 2015 IEEE/ACM 2nd International Symposium on Big Data Computing (BDC)
Keywords: Data-intensive computing, I/O interference, big data, distributed I/O performance
DocType: Conference
Citations: 1
PageRank: 0.51
References: 11
Authors: 3
Name                     Order   Citations   PageRank
Matthew Malensek         1       93          10.44
Sangmi Lee Pallickara    2       170         24.46
Shrideep Pallickara      3       837         92.72