Title
Incremental approaches for heterogeneous feature selection in dynamic ordered data
Abstract
Feature selection can identify essential features and reduce the dimensionality of features, improving the classification ability of a learning model. In this study, we consider data with a preference-order relation, i.e., ordered data. In the big data era, ordered data contain noise and exhibit heterogeneous features (including numerical and categorical features) and dynamic characteristics (i.e., new objects are added and obsolete objects are removed with evolving time). The dominance-based neighborhood rough set (DNRS) considers the preference order relation of heterogeneous features and demonstrates fault tolerance; thus, it can be applied well to heterogeneous feature selection in ordered data. At present, DNRS-based heterogeneous feature selection methods are only used for static ordered data. For dynamic ordered data, existing heterogeneous feature selection approaches are highly time-consuming because they are required to recalculate knowledge from scratch when multiple objects vary. Motivated by this issue, we utilize a matrix-based method in this work to study incremental heterogeneous feature selection based on DNRS in dynamic ordered data. First, we define neighborhood dominance conditional entropy (NDCE) as the uncertainty measure and introduce a non-monotonic feature selection strategy based on this measure. Second, the neighborhood dominance relation matrix and its diagonal matrix are defined to calculate NDCE in matrix form. Third, the updating mechanisms of the diagonal matrix are studied when objects vary and used to update NDCE. Lastly, two incremental feature selection algorithms are proposed when multiple objects are added to or deleted from heterogeneous ordered data. Experiments are performed on public data sets. Experimental results verify that the proposed incremental algorithms are effective and efficient for updating feature subsets in dynamic heterogeneous ordered data.
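The matrix-based incremental idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the exact definition of the neighborhood dominance relation or of NDCE, so the tolerance rule in `dominates`, the parameter `delta`, and all function names below are assumptions made for illustration, and only numerical features are handled. The sketch shows only the general point that when objects are added or deleted, the entries of a pairwise relation matrix between unchanged objects can be reused rather than recomputed.

```python
from typing import List, Sequence

Row = Sequence[float]


def dominates(x: Row, y: Row, delta: float) -> int:
    """Assumed rule: y is at least as good as x on every feature, up to tolerance delta."""
    return int(all(yv >= xv - delta for xv, yv in zip(x, y)))


def relation_matrix(data: Sequence[Row], delta: float) -> List[List[int]]:
    """Build the full pairwise relation matrix from scratch (the static baseline)."""
    return [[dominates(x, y, delta) for y in data] for x in data]


def add_objects(matrix: List[List[int]], old_data: Sequence[Row],
                new_data: Sequence[Row], delta: float) -> List[List[int]]:
    """Extend the matrix, computing only the entries that involve a new object."""
    all_data = list(old_data) + list(new_data)
    n_old = len(old_data)
    updated = []
    for i, x in enumerate(all_data):
        if i < n_old:
            # Old row: reuse the stored entries, append columns for the new objects.
            row = list(matrix[i]) + [dominates(x, y, delta) for y in new_data]
        else:
            # New row: every entry has to be computed.
            row = [dominates(x, y, delta) for y in all_data]
        updated.append(row)
    return updated


def delete_objects(matrix: List[List[int]], removed: Sequence[int]) -> List[List[int]]:
    """Drop the rows and columns of removed objects; no entry is recomputed."""
    drop = set(removed)
    keep = [i for i in range(len(matrix)) if i not in drop]
    return [[matrix[i][j] for j in keep] for i in keep]
```

With `n` existing objects and `m` new ones, `add_objects` evaluates roughly `2nm + m*m` pairs instead of `(n + m) * (n + m)`, which is where an incremental update of this kind would save time over recomputation from scratch.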
Year: 2020
DOI: 10.1016/j.ins.2020.06.051
Venue: Information Sciences
Keywords: Heterogeneous ordered decision system, Dominance-based neighborhood rough set, Feature selection, Matrix-based incremental algorithm
DocType: Journal
Volume: 541
ISSN: 0020-0255
Citations: 5
PageRank: 0.38
References: 41
Authors: 5
Name            Order   Citations   PageRank
Binbin Sang     1       37          6.26
Hongmei Chen    2       738         25.19
Tianrui Li      3       3176        191.76
Weihua Xu       4       348         23.88
Hong Yu         5       1982        179.13