Abstract |
---|
Fuzzy rough set based feature selection is a widely adopted technique for big data analysis. However, its high accuracy depends on the correlations among all the data, so it normally runs in a centralized computing mode. As data volume grows, a centralized server no longer has the computation capability or memory space to carry out fuzzy rough set computation. To enable fuzzy rough sets for big data analysis, in this paper we propose a novel Distributed Fuzzy Rough Set (DFRS) based feature selection method for cloud computing, which splits the task and assigns the parts to multiple nodes for parallel computing. The key challenge is to maintain the global information on each distributed node without storing the entire fuzzy relation matrix. We tackle this challenge with a dynamic data decomposition algorithm and a data summarization process on each distributed node. Extensive experiments on multiple real datasets demonstrate that DFRS significantly reduces the runtime while its feature selection accuracy remains nearly the same as that of traditional centralized computing. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICPADS47876.2019.00023 | 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS) |
Keywords | Field | DocType |
Distributed feature selection, big data, fuzzy rough sets, dynamic data decomposition | Automatic summarization, Feature selection, Computer science, Centralized computing, Fuzzy rough sets, Dynamic data, Big data, Cloud computing, Distributed computing, Computation | Conference |
ISSN | ISBN | Citations |
1521-9097 | 978-1-7281-2584-8 | 0 |
PageRank | References | Authors |
0.34 | 15 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wenhao Qu | 1 | 0 | 0.34 |
Linghe Kong | 2 | 770 | 72.44 |
Kaishun Wu | 3 | 1059 | 94.59 |
Feilong Tang | 4 | 432 | 61.65 |
Guihai Chen | 5 | 3537 | 317.28 |