Title
Topology-aware Heuristic Data Allocation Algorithm for Big Data Infrastructure
Abstract
We propose a novel optimal data placement technique that considers not only data locality but also the global data access cost, with the aim of improving MapReduce performance in cloud data centers. We first conducted an analytical and experimental study to identify the performance issues of MapReduce in data centers. The study shows that MapReduce tasks involving unexpected remote data access incur much higher communication cost and execution time, and can significantly degrade overall performance. To solve the optimal data placement problem, we propose a topology-aware heuristic algorithm that first constructs a replica-equalized structure for the abstract tree structure and then builds a replica-similarity structure for the detailed tree construction. The experimental results demonstrate that our optimal data placement approach can effectively minimize the global data access cost, with lower communication cost and shorter execution time, by reducing unexpected remote data access.
Year
2015
DOI
10.1109/BigDataService.2015.10
Venue
BigDataService
Keywords
cloud data center, MapReduce, optimal data allocation, topology-aware, heuristic algorithm
Field
Data mining, Heuristic, Computer science, Heuristic (computer science), Network topology, Tree structure, Data access, Data center, Big data, Cloud computing, Distributed computing
DocType
Conference
Citations
0
PageRank
0.34
References
13
Authors
4
Name, Order, Citations, PageRank
Wuhui Chen, 1, 307, 34.07
Banage T. G. S. Kumara, 2, 62, 9.65
Incheon Paik, 3, 241, 38.80
Zhenni Li, 4, 99, 14.48