Abstract |
---|
The discretization of continuous attributes is an important preprocessing step for machine learning and data mining. Efficiently discretizing the continuous attributes of massive data sets has become an urgent problem. Hadoop, a technique that has risen rapidly in recent years, can efficiently process many applications built on massive data. This paper designs and implements a parallel Chi2-based discretization algorithm on the MapReduce model. Without sacrificing discretization quality, experiments were conducted with data sets of different sizes on different numbers of nodes. The experimental results show that the proposed algorithm achieves high efficiency and good scalability when discretizing the continuous attributes of massive data. |
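As background for the kind of chi-square-based bottom-up merging that the Chi2 family of algorithms builds on, the following is a minimal serial ChiMerge-style sketch, not the paper's parallel MapReduce implementation. The function names and the default threshold of 3.84 (the χ² critical value at 0.05 significance with one degree of freedom) are illustrative assumptions:

```python
from collections import Counter

def chi2_stat(a, b):
    """Chi-square statistic for two adjacent intervals.
    a, b: Counter mapping class label -> count in each interval."""
    classes = set(a) | set(b)
    n_a, n_b = sum(a.values()), sum(b.values())
    total = n_a + n_b
    stat = 0.0
    for c in classes:
        col = a.get(c, 0) + b.get(c, 0)          # column total for class c
        for obs, n in ((a.get(c, 0), n_a), (b.get(c, 0), n_b)):
            exp = n * col / total                 # expected count
            if exp > 0:
                stat += (obs - exp) ** 2 / exp
    return stat

def chimerge(values, labels, threshold=3.84, min_intervals=2):
    """ChiMerge-style bottom-up discretization: start with one interval
    per distinct value, then repeatedly merge the adjacent pair with the
    smallest chi-square until every pair exceeds the threshold (or only
    min_intervals remain).  Returns a list of (low, high) intervals."""
    pts = sorted(set(values))
    intervals = [[v, v, Counter()] for v in pts]
    idx = {v: i for i, v in enumerate(pts)}
    for v, y in zip(values, labels):
        intervals[idx[v]][2][y] += 1
    while len(intervals) > min_intervals:
        stats = [chi2_stat(intervals[i][2], intervals[i + 1][2])
                 for i in range(len(intervals) - 1)]
        i = min(range(len(stats)), key=stats.__getitem__)
        if stats[i] >= threshold:
            break                                 # all pairs are distinct enough
        lo, _, ca = intervals[i]
        _, hi, cb = intervals[i + 1]
        intervals[i:i + 2] = [[lo, hi, ca + cb]]  # merge the pair
    return [(lo, hi) for lo, hi, _ in intervals]
```

A MapReduce parallelization of this idea would typically compute the per-interval class counts in the map phase and perform the chi-square merging over the aggregated counts in the reduce phase; the exact partitioning used by the paper is not shown here.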
Year | DOI | Venue |
---|---|---
2014 | 10.1007/978-3-319-15554-8_83 | HUMAN CENTERED COMPUTING, HCC 2014 |
Keywords | Field | DocType
---|---|---
Hadoop, MapReduce, Chi2 algorithm, Large-scale data, Discretization | Discretization, Data set, Computer science, Parallel computing, Algorithm, Scalability | Conference

Volume | ISSN | Citations
---|---|---
8944 | 0302-9743 | 1

PageRank | References | Authors
---|---|---
0.36 | 4 | 3
Name | Order | Citations | PageRank |
---|---|---|---
Yong Zhang | 1 | 14 | 4.98 |
Jingwen Yu | 2 | 1 | 0.36 |
Jianying Wang | 3 | 1 | 0.36 |