Hadoop framework: impact of data organization on performance. - Citegraph

Paper Info

Title
Hadoop framework: impact of data organization on performance.

Abstract
Hadoop, based on the popular MapReduce framework, is an open-source distributed computing framework that has been gaining much popularity and usage. It aims to allow programmers to focus on building applications that deal with processing large amounts of data, without having to handle other issues when performing parallel computations. However, tuning the performance of Hadoop applications is not an easy task due to the level of abstraction of the framework. In this paper, we present three case studies and some of the challenges and issues that are to be considered in performance tuning when running applications in Hadoop. The focus is mainly on the impact of input data on Hadoop's performance and how that performance can be tuned. Copyright (c) 2011 John Wiley & Sons, Ltd.

Year	2013
DOI	10.1002/spe.1082
Venue	SOFTWARE-PRACTICE & EXPERIENCE
Keywords	mapreduce, hadoop, performance tuning, distributed computing
DocType	Journal
Volume	43
Issue	SP11
ISSN	0038-0644
Citations	4
PageRank	0.39
References	3
Authors	9

Authors (9 rows)

Name	Order	Citations	PageRank
Yu Shyang Tan	1	70	4.58
Jiaqi Tan	2	412	25.57
Eng Siong Chng	3	970	106.33
Bu-Sung Lee	4	2119	140.18
Jiaming Li	5	25	3.80
Susumu Date	6	133	28.14
Hui Ping Chak	7	4	0.39
Xiong Xiao	8	281	34.97
Atsushi Narishige	9	10	1.33
