Title
Gene discretization based on EM clustering and adaptive sequential forward gene selection for molecular classification.
Abstract
Boost gene discrimination capability by feature discretization with EM clustering. Explore subsets of informative genes by an adaptive sequential forward search algorithm. Cancer classification based solely on the discretized gene expression monitoring. Predict distinction between multiple subclasses without previous biological knowledge.
The mismatch between gene dimension and sample dimension poses a great challenge for many modelling problems in bioinformatics. Feature selection in immense quantities of high-dimensional data for molecular classification poses new tasks for modern data mining techniques. The advent of microarray datasets pushed research in bioinformatics to a new frontier over the last decade. Many bioinformatics applications necessitate feature selection or dimensionality reduction techniques for identifying informative genes or selecting subsets of genes with discrimination power. Here, gene discretization based on EM clustering is employed to reduce complexity and improve discrimination capability. Then, an adaptive sequential forward search algorithm is proposed for exploring distinct subsets of genes with discrimination power. By monitoring the information gain acquired from a collection of selected features, we are able to predict distinctions between multiple subclasses without prior knowledge of these subclasses. Experimental results demonstrate the feasibility of cancer classification based solely on monitoring discretized gene expression, completely independent of prior biological knowledge.
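The two steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`em_discretize`, `information_gain`, `forward_select`), the one-dimensional Gaussian mixture, the evenly spaced initial means, and the stopping rule (stop when the joint information gain improves by less than a hypothetical `min_gain`) are all assumptions made for illustration, and the toy expression values are invented.

```python
import math
from collections import Counter

def em_discretize(values, k=2, iters=50):
    """Fit a 1-D Gaussian mixture by EM; return the most likely component per value."""
    lo, hi = min(values), max(values)
    # Assumption: initial means are spread evenly across the observed range.
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    variances = [max((hi - lo) ** 2 / (4 * k * k), 1e-6)] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate mean, variance, and weight of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density finite
            weights[j] = nj / len(values)
    return [max(range(k), key=lambda j: r[j]) for r in resp]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels):
    """Information gain of the class labels given one tuple of discrete values per sample."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row, []).append(y)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def forward_select(columns, labels, min_gain=1e-9):
    """Greedy forward search: add the gene that most raises the joint information gain.

    Assumption: the search stops once the best candidate improves the gain
    by less than min_gain.
    """
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        scored = []
        for g in sorted(remaining):
            rows = list(zip(*(columns[i] for i in selected + [g])))
            scored.append((information_gain(rows, labels), g))
        gain, g = max(scored, key=lambda t: (t[0], -t[1]))
        if gain - best < min_gain:
            break
        selected.append(g)
        remaining.remove(g)
        best = gain
    return selected, best

# Toy data (invented): two genes measured on six samples from two classes.
expression = [
    [0.1, 0.2, 0.15, 5.0, 5.2, 5.1],   # gene 0 separates the classes
    [1.0, 3.0, 1.1, 2.9, 1.05, 3.1],   # gene 1 is mostly uninformative
]
classes = [0, 0, 0, 1, 1, 1]
discretized = [em_discretize(gene) for gene in expression]
chosen, gain = forward_select(discretized, classes)
```

Here each gene is discretized independently before the search, so the selection step only ever sees small integer codes rather than raw expression levels.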
Year: 2016
DOI: 10.1016/j.asoc.2016.07.015
Venue: Appl. Soft Comput.
Keywords: Gene discretization, EM clustering, Sequential forward selection, Molecular classification
Field: Discretization, Dimensionality reduction, Search algorithm, Gene, Feature selection, Artificial intelligence, Gene selection, Mathematical optimization, Pattern recognition, Expectation–maximization algorithm, Information gain, Machine learning, Mathematics
DocType: Journal
Volume: 48
Issue: C
ISSN: 1568-4946
Citations: 3
PageRank: 0.36
References: 19
Authors: 1
Name: Hung-Yi Lin
Order: 1
Citations: 39
PageRank: 8.74