Title
Machine Learning-Based Method for Obesity Risk Evaluation Using Single-Nucleotide Polymorphisms Derived from Next-Generation Sequencing.
Abstract
Obesity is a major risk factor for many metabolic diseases. Single-nucleotide polymorphisms (SNPs) derived from next-generation sequencing (NGS) offer genome-wide insight into the genetic characteristics of obese individuals. However, interpreting these SNP data for clinical application is difficult given the high complexity of NGS data. Hence, in this study, obesity risk prediction models based on SNPs were designed using machine learning (ML) methods, namely support vector machine (SVM), k-nearest neighbor, and decision tree (DT). Clinicopathological features, including 130 SNPs, sex, and age, were obtained from 139 eligible individuals. Several feature selection methods, namely stepwise multivariate linear regression (MLR), DT, and genetic algorithms, were applied to select informative features for building the obesity prediction models. Multivariate logistic regression was used to evaluate the importance of the selected features. The predictive performance of the models trained on each feature set was evaluated using fivefold cross-validation. Three measures, namely accuracy, sensitivity, and specificity, were used to compare the predictive power of the models. Nine SNPs (rs10501087, rs17700144, rs2287019, rs534870, rs660339, rs7081678, rs718314, rs9816226, and rs984222) were selected by stepwise MLR as the features for the ML-based models. When trained on the same features, the SVM model significantly outperformed the other classifiers, achieving 70.77% accuracy, 80.09% sensitivity, and 63.02% specificity. These results demonstrate that the selected SNPs are effective for detecting obesity risk. Additionally, the ML-based method provides a feasible means of conducting preliminary analyses of the genetic characteristics of obesity.
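The evaluation scheme named in the abstract (fivefold cross-validation scored by accuracy, sensitivity, and specificity) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the label coding (1 = obese, 0 = non-obese), and the contiguous, unshuffled fold assignment are all assumptions made for illustration. A real analysis would likely shuffle or stratify the samples before splitting.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn), assuming 1 = obese (positive) and 0 = non-obese."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity (true-positive rate), and specificity (true-negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def kfold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds and return (train, test) index lists.

    Hypothetical helper: fold sizes differ by at most one sample, and no shuffling is done.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


if __name__ == "__main__":
    # Toy labels only; these are not data from the study.
    print(metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
    # 139 matches the cohort size stated in the abstract.
    print([len(test) for _, test in kfold_indices(139, k=5)])
```

In use, a classifier would be fitted on each `train` index set and scored on the matching `test` set with `metrics`, and the three measures would then be averaged across the five folds.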
Year: 2018
DOI: 10.1089/cmb.2018.0002
Venue: JOURNAL OF COMPUTATIONAL BIOLOGY
Keywords: machine learning, next-generation sequencing (NGS), obesity, single-nucleotide polymorphisms (SNPs)
Field: Risk evaluation, Obesity, Single-nucleotide polymorphism, DNA sequencing, Artificial intelligence, Mathematics, Machine learning, Risk factor
DocType: Journal
Volume: 25
Issue: 12
ISSN: 1066-5277
Citations: 0
PageRank: 0.34
References: 0
Authors: 9

Name              Order  Citations  PageRank
Hsin-Yao Wang     1      15         4.14
Shih-Cheng Chang  2      0          0.34
Wan-Ying Lin      3      9          2.28
Chun-Hsien Chen   4      0          0.34
Szu-Hsien Chiang  5      0          0.34
Kai-Yao Huang     6      115        7.91
Bo-Yu Chu         7      0          0.34
Jang-Jih Lu       8      0          0.34
Tzong-Yi Lee      9      617        37.18