Title
From Big To Smart Data: Iterative Ensemble Filter For Noise Filtering In Big Data Classification
Abstract
The quality of the data is directly related to the quality of the models drawn from that data. For that reason, much research is devoted to improving the quality of the data and amending the errors it may contain. One of the most common problems is the presence of noise in classification tasks, where noise refers to the incorrect labeling of training instances. This problem is very disruptive, as it changes the decision boundaries of the problem. Big Data problems pose a new challenge in terms of data quality due to the massive and unsupervised accumulation of data. This Big Data scenario also brings new problems to classic data preprocessing algorithms, as they are not prepared to work with such amounts of data, and these algorithms are key to moving from Big to Smart Data. In this paper, an iterative ensemble filter for removing noisy instances in Big Data scenarios is proposed. Experiments carried out on six Big Data datasets have shown that our noise filter outperforms the current state-of-the-art noise filter in Big Data domains. It has also proved to be an effective solution for transforming raw Big Data into Smart Data.
Year: 2019
DOI: 10.1002/int.22193
Venue: INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS
Keywords: Big Data, class noise, classification, ensemble, Smart Data
Field: Data mining, Filter (signal processing), Artificial intelligence, Smart data, Big data, Machine learning, Mathematics
DocType: Journal
Volume: 34
Issue: 12
ISSN: 0884-8173
Citations: 1
PageRank: 0.35
References: 0
Authors: 5
Name                      Order  Citations  PageRank
Diego García-Gil          1      19         2.69
Francisco Luque‐Sánchez   2      1          0.35
Julian Luengo             3      2418       77.15
Salvador García           4      4151       118.45
Francisco Herrera         5      27391      1168.49