Title
A semi-parallel framework for greedy information-theoretic feature selection.
Abstract
Feature selection (FS) is a well-studied area that mitigates issues related to the curse of dimensionality and overfitting. FS is a preprocessing procedure that identifies a feature subset that is both relevant and non-redundant. Although FS has been driven by the exploration of "big data" and the development of high-performance computing, the implementation of scalable information-theoretic FS remains an under-explored topic. In this contribution, we revisit the greedy optimization procedure of information-theoretic filter FS and propose a semi-parallel optimization paradigm that can provide a feature set equivalent to that of the greedy FS algorithms in a fraction of the time. We focus on greedy selection algorithms because of their large computational complexity, which grows rapidly with the number of features. Our framework is benchmarked against twelve datasets, including one extremely large dataset with more than a million features, and we show that our framework can significantly speed up FS while selecting nearly the same features as state-of-the-art information-theoretic FS methods.
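The abstract does not specify which greedy information-theoretic criterion is used, so as a minimal illustration, the following sketch implements greedy forward selection with the simple Mutual Information Maximization (MIM) filter: at each step, the unselected feature with the highest empirical mutual information with the labels is added. The function names (`mutual_information`, `greedy_mim`) are hypothetical and not the paper's API; redundancy-aware criteria such as mRMR or JMI would add a penalty term inside the scoring step.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        p_ab = c / n
        # MI = sum over joint outcomes of p(a,b) * log( p(a,b) / (p(a) p(b)) )
        mi += p_ab * math.log(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi

def greedy_mim(features, labels, k):
    """Greedy forward selection: repeatedly add the unselected feature
    (a row of `features`) that maximizes I(X_f; Y). Returns feature indices
    in the order they were selected."""
    selected = []
    remaining = set(range(len(features)))
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda f: mutual_information(features[f], labels))
        selected.append(best)
        remaining.remove(best)
    return selected
```

On a toy example where feature 0 perfectly predicts the labels and feature 1 is independent noise, `greedy_mim([[0, 0, 1, 1], [0, 1, 0, 1]], [0, 0, 1, 1], 1)` returns `[0]`. The semi-parallel paradigm proposed in the paper targets the scoring loop inside `max(...)`, whose cost is what grows rapidly with the number of candidate features.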
Year: 2019
DOI: 10.1016/j.ins.2019.03.075
Venue: Information Sciences
Keywords: Feature selection, Information theory, Parallel computing
Field: Feature selection, Curse of dimensionality, Preprocessor, Artificial intelligence, Overfitting, Big data, Mathematics, Machine learning, Computational complexity theory, Speedup, Scalability
DocType: Journal
Volume: 492
ISSN: 0020-0255
Citations: 1
PageRank: 0.38
References: 0
Authors: 2
Name | Order | Citations | PageRank
Heng Liu | 1 | 153 | 27.10
Gregory Ditzler | 2 | 214 | 16.55