Title
An Ensemble Feature Selection Method Based on Deep Forest for Microbiome-Wide Association Studies
Abstract
With the rapid advancement of DNA sequencing, metagenomics and metatranscriptomics have made great progress, deepening our understanding of the human microbiome and its impact on human health and disease. The microbiome, which is characterized by small sample sizes, high dimensionality and complicated relationships with the host, refers to the species, genes and genomes of the microbiota, as well as the products of the microbiota and the host environment. Many machine learning methods have been used to conduct Microbiome-Wide Association Studies, which link the microbiome to phenotypes such as human health and disease status. However, existing methods such as Support Vector Machines (SVMs) are limited in deep representation learning: they lack the deep architectures that promote feature reuse and can lead to progressively more abstract features at higher layers of representation. Recently, Deep Neural Networks (DNNs), a class of deep learning models, have been widely used for metagenomic data analysis and perform well on representation learning. But they are regarded as black boxes and are criticized for their lack of interpretability. Thus, it is interesting to explore other deep learning models for metagenomic data analysis. In this work, we introduce a deep learning model called Deep Forest to study microbiome associations, and we also present an ensemble method for feature selection. Experimental results show that Deep Forest outperforms the traditional machine learning methods. In addition, compared to DNNs, Deep Forest has better interpretability and fewer hyperparameters.
Year: 2018
DOI: 10.1109/BIBM.2018.8621461
Venue: 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Keywords: ensemble feature selection method, human microbiome, human health, diseases, genes, genomes, microbiota, deep learning model, metagenomic data analysis, representation learning, deep forest, support vector machines, deep neural networks, machine learning
Field: Interpretability, Feature selection, Computer science, Support vector machine, Microbiome, Metagenomics, Artificial intelligence, Deep learning, Feature learning, Machine learning, Human microbiome
DocType: Conference
ISSN: 2156-1125
ISBN: 978-1-5386-5489-7
Citations: 0
PageRank: 0.34
References: 0
Authors: 7
Name            Order  Citations  PageRank
Qiang Zhu       1      0          1.01
Min Pan         2      0          1.69
Lei Liu         3      588        64.83
Bojing Li       4      0          1.01
Tingting He     5      14         9.19
Xingpeng Jiang  6      34         20.30
Xiaohua Hu      7      2819       314.15