Title
Distributed Fuzzy Rough Set for Big Data Analysis in Cloud Computing
Abstract
Fuzzy rough set based feature selection is a widely adopted technique for big data analysis. However, its high accuracy depends on correlations across the entire dataset, so it has conventionally run in a centralized computing mode. As data volume grows, a centralized server's computation capability and memory space can no longer support fuzzy rough set computation. To enable fuzzy rough sets for big data analysis, this paper proposes Distributed Fuzzy Rough Set (DFRS) based feature selection in cloud computing, which splits the work into tasks and assigns them to multiple nodes for parallel computing. The key challenge is to maintain global information on each distributed node without storing the entire fuzzy relation matrix. We tackle this challenge with a dynamic data decomposition algorithm and a data summarization process on each distributed node. Extensive experiments on multiple real datasets demonstrate that DFRS significantly reduces runtime while achieving feature selection accuracy nearly the same as that of traditional centralized computing.
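To illustrate why the fuzzy relation matrix becomes a bottleneck, the following is a minimal sketch of the standard centralized fuzzy rough dependency computation, not the DFRS algorithm from the paper. The function names (`fuzzy_relation`, `dependency`), the range-normalized similarity, the min t-norm, and the Kleene-Dienes style lower approximation are all assumptions chosen for illustration. Note that every feature requires an n x n matrix over all samples.

```python
from typing import List, Sequence


def fuzzy_relation(column: Sequence[float]) -> List[List[float]]:
    """Hypothetical helper: n x n fuzzy similarity matrix for one numeric
    feature, using R(x, y) = 1 - |a(x) - a(y)| / range(a)."""
    span = (max(column) - min(column)) or 1.0  # avoid division by zero
    return [[1.0 - abs(x - y) / span for y in column] for x in column]


def dependency(data: Sequence[Sequence[float]],
               labels: Sequence[int],
               subset: Sequence[int]) -> float:
    """Fuzzy rough dependency degree of the class labels on a feature subset.

    Assumes crisp class labels, so the lower approximation membership of a
    sample to its own class is the minimum of 1 - R(x, y) over all samples y
    that belong to a different class.
    """
    n = len(data)
    # Combine the per-feature relations with the min t-norm.
    rel = [[1.0] * n for _ in range(n)]
    for f in subset:
        r = fuzzy_relation([row[f] for row in data])
        rel = [[min(rel[i][j], r[i][j]) for j in range(n)] for i in range(n)]
    # Positive region membership of each sample.
    pos = [
        min((1.0 - rel[i][j] for j in range(n) if labels[j] != labels[i]),
            default=1.0)
        for i in range(n)
    ]
    return sum(pos) / n
```

A greedy selector would repeatedly add the feature that raises `dependency` the most. Because each call touches all n x n sample pairs, both memory and time grow quadratically with the number of samples, which is the cost the abstract says a single server cannot absorb.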
Year: 2019
DOI: 10.1109/ICPADS47876.2019.00023
Venue: 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS)
Keywords: Distributed feature selection, big data, fuzzy rough sets, dynamic data decomposition
Field: Automatic summarization, Feature selection, Computer science, Centralized computing, Fuzzy rough sets, Dynamic data, Big data, Cloud computing, Distributed computing, Computation
DocType: Conference
ISSN: 1521-9097
ISBN: 978-1-7281-2584-8
Citations: 0
PageRank: 0.34
References: 15
Authors: 5
Name | Order | Citations | PageRank
Wenhao Qu | 1 | 0 | 0.34
Linghe Kong | 2 | 770 | 72.44
Kaishun Wu | 3 | 1059 | 94.59
Feilong Tang | 4 | 432 | 61.65
Guihai Chen | 5 | 3537 | 317.28