Title
Using Machine Learning Models To Classify Stroke Risk Level Based On National Screening Data
Abstract
Stroke has high incidence, prevalence, and mortality, and it places a heavy burden on families and society in China. In 2009, the Ministry of Health of China launched the China National Stroke Screening and Intervention Program, which screens for stroke risk factors and delivers interventions to high-risk individuals among people over 40 years of age across China. In this program, the stroke risk factors are hypertension, diabetes, dyslipidemia, atrial fibrillation, smoking, lack of exercise, being apparently overweight or obese, and a family history of stroke. People with more than two risk factors, or with a history of stroke or transient ischemic attack (TIA), are considered high-risk. However, this criterion cannot assign a stroke risk level to people whose risk-factor fields contain "unknown" values. The resulting missing risk levels reduce the efficiency of stroke interventions and introduce inaccuracies into national-level statistics. In this paper, we first construct training and test sets and balance the imbalanced training set using oversampling and undersampling. We then develop logistic regression, decision tree, neural network, and random forest models for stroke risk classification and evaluate them on recall and precision. The results show that the random forest model achieves the best performance in terms of recall and precision. The models constructed in this paper can improve screening efficiency and avoid unnecessary rescreening and intervention expenditures.
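The screening criterion and the two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The field names, the use of None for "unknown", and the label strings are all assumptions made for illustration. The sketch also assumes that a record with unknown values can still be labeled when the unknowns could not change the outcome; the abstract itself only says that records with unknown values cannot be classified by the criterion.

```python
from typing import Optional

# The eight risk factors named in the abstract; the key names are hypothetical.
RISK_FACTORS = (
    "hypertension", "diabetes", "dyslipidemia", "atrial_fibrillation",
    "smoking", "lack_of_exercise", "overweight_or_obese", "family_history",
)


def rule_based_risk(record: dict) -> Optional[str]:
    """Apply the screening criterion; return None when it cannot decide.

    Each field is True, False, or None (None encodes an "unknown" value;
    a missing key is treated the same way).
    """
    history = record.get("stroke_or_tia_history")
    values = [record.get(name) for name in RISK_FACTORS]
    positives = sum(v is True for v in values)
    unknowns = sum(v is None for v in values)

    # More than two known risk factors, or a known stroke/TIA history.
    if history is True or positives > 2:
        return "high-risk"
    # Assumption: if even all unknowns being positive cannot cross the
    # threshold, the record is safely not high-risk.
    if history is False and positives + unknowns <= 2:
        return "not high-risk"
    return None  # the unknown values make the criterion undecidable


def precision_recall(y_true, y_pred, positive="high-risk"):
    """Precision and recall for the positive (high-risk) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Records for which `rule_based_risk` returns None are the ones a trained classifier would need to label. Recall on the high-risk class measures how many truly high-risk people are caught, and precision measures how many flagged people are truly high-risk.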
Year
2019
DOI
10.1109/EMBC.2019.8857657
Venue
2019 41ST ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC)
Field
Computer vision, Population, Decision tree, Gerontology, Psychological intervention, Computer science, Overweight, Decision tree model, Stroke, Artificial intelligence, Random forest, Logistic regression
DocType
Conference
Volume
2019
ISSN
1557-170X
Citations
0
PageRank
0.34
References
0
Authors
6
Name            Order  Citations  PageRank
Xuemeng Li      1      0          2.37
Di Bian         2      0          0.68
Jinghui Yu      3      0          1.69
Huajian Mao     4      0          2.70
Mei Li          5      21         1.53
Dongsheng Zhao  6      0          1.35