Title
Multi-node Big Data VM Platform and Job Submission Portlet
Abstract
The present study uses VirtualBox virtualization to build a compact, personal multi-node big data VM platform with Spark and Hadoop clusters. The platform replicates a real cluster environment so that developers can easily design and implement Spark and Hadoop Map/Reduce programs. On the multi-node Hadoop VM system, developers can write and run Map/Reduce programs exactly as they would on a real multi-node Hadoop cluster. To demonstrate its capability and applicability, this study benchmarks the big data VM platform against a physical multi-node Hadoop cluster. On the standard WordCount benchmark, the physical multi-node Hadoop cluster is 3.7 times faster than the VM Hadoop cluster. The benchmark results show that the big data VM platform is an ideal platform for portal development, Map/Reduce programming, Spark programming, and testing, while the physical Hadoop cluster is the most appropriate for production runs. In addition, the big data VM platform contains a web portal development module designed to support applications that implement big data computing services for engineering and science users. Such applications are inherently complex, potentially accessing data from a variety of sources and distributing applications to a variety of clients. This portal development module can play multiple roles across projects, serving as a personal portal, small business portal, enterprise portal, educational portal, infrastructure portal, or other type of portal. Finally, the big data VM platform, as a big data development platform, is ready for users to download. The first author of this paper would like to give a demonstration of the proposed multi-node big data VM platform.
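For readers unfamiliar with the WordCount benchmark named in the abstract, the following is a minimal sketch of the Map/Reduce pattern it exercises, written in plain Python with no Hadoop or Spark dependency. It is not the authors' benchmark code; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the lowercase, whitespace-based tokenization are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for each whitespace-separated token."""
    # Assumption: tokens are lowercased and split on whitespace only.
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["big data big", "Data platform"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel across nodes and the framework performs the shuffle; this single-process sketch only shows the data flow.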
Year
2017
DOI
10.1109/ACIT-CSII-BCD.2017.72
Venue
2017 5th Intl Conf on Applied Computing and Information Technology/4th Intl Conf on Computational Science/Intelligence and Applied Informatics/2nd Intl Conf on Big Data, Cloud Computing, Data Science (ACIT-CSII-BCD)
Keywords
Big Data, Spark, Hadoop, Computation, MapReduce, Personal Platform, Portal
DocType
Conference
ISBN
978-1-5386-3303-8
Citations
0
PageRank
0.34
References
1
Authors
6
Name              Order  Citations  PageRank
Chien-Heng Wu     1      3          1.82
Wen-Yi Chang      2      12         3.34
Whey-Fone Tsai    3      0          0.34
Franco Lin        4      0          0.34
Ching-Fang Lee    5      0          0.34
Chao-Tung Yang    6      1196       139.50