Title | ||
---|---|---|
Model selection for metabolomics: predicting diagnosis of coronary artery disease using automated machine learning (AutoML). |
Abstract | ||
---|---|---|
Motivation: Selecting the optimal machine learning (ML) model for a given dataset is often challenging. Automated ML (AutoML) has emerged as a powerful tool for enabling the automatic selection of ML methods and parameter settings for the prediction of biomedical endpoints. Here, we apply the tree-based pipeline optimization tool (TPOT) to predict angiographic diagnoses of coronary artery disease (CAD). With TPOT, ML models are represented as expression trees and optimal pipelines are discovered using a stochastic search method called genetic programming. We provide some guidelines for TPOT-based ML pipeline selection and optimization based on various clinical phenotypes and high-throughput metabolic profiles in the Angiography and Genes Study (ANGES). Results: We analyzed nuclear magnetic resonance-derived lipoprotein and metabolite profiles in the ANGES cohort with the goal of identifying the role of non-obstructive CAD patients in CAD diagnostics. We performed a comparative analysis of TPOT-generated ML pipelines with selected ML classifiers, optimized with a grid search approach, applied to two phenotypic CAD profiles. As a result, TPOT generated ML pipelines that outperformed grid search-optimized models across multiple performance metrics, including balanced accuracy and area under the precision-recall curve. With the selected models, we demonstrated that the phenotypic profile that distinguishes non-obstructive CAD patients from no-CAD patients is associated with higher precision, suggesting a discrepancy in the underlying processes between these phenotypes. |
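
The abstract compares models on balanced accuracy and area under the precision-recall curve. As a rough illustration of what those two metrics measure, the sketch below computes them in plain Python on a small made-up example. The labels, scores, and the 0.5 decision threshold are hypothetical, and average precision is used here as one common step-wise estimate of the area under the precision-recall curve; the record does not say how the authors computed it.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = case, 0 = control).

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def average_precision(y_true, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    Averages the precision observed at the rank of each true case when
    samples are sorted by decreasing score. Assumes at least one case and
    no tied scores (ties would need an explicit tie-breaking rule).
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(y_true)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 4 cases and 4 controls with classifier scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

bal_acc = balanced_accuracy(y_true, y_pred)
avg_prec = average_precision(y_true, scores)
print(f"balanced accuracy: {bal_acc:.3f}")
print(f"average precision: {avg_prec:.3f}")
```

Balanced accuracy depends on the chosen threshold, while average precision is computed from the ranking of scores alone, which is one reason a comparison may report both.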
Year | DOI | Venue |
---|---|---|
2020 | 10.1093/bioinformatics/btz796 | BIOINFORMATICS |
DocType | Volume | Issue |
---|---|---|
Journal | 36 | 6 |
ISSN | Citations | PageRank |
---|---|---|
1367-4803 | 2 | 0.41 |
References | Authors | |
---|---|---|
0 | 12 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Alena Orlenko | 1 | 4 | 1.17 |
Daniel Kofink | 2 | 2 | 0.41 |
Leo-Pekka Lyytikäinen | 3 | 2 | 0.41 |
Kjell Nikus | 4 | 2 | 0.41 |
Pashupati Mishra | 5 | 4 | 0.83 |
Pekka Kuukasjärvi | 6 | 2 | 0.41 |
Pekka J Karhunen | 7 | 2 | 0.41 |
Mika Kähönen | 8 | 2 | 0.41 |
Jari O Laurikka | 9 | 2 | 0.75 |
Terho Lehtimäki | 10 | 6 | 1.60 |
Folkert W. Asselbergs | 11 | 121 | 9.36 |
Jason H Moore | 12 | 119 | 16.30 |