Title
Balancing Assisted Reproductive Technology Dataset For Improving The Efficiency Of Incremental Classifiers And Feature Selection Techniques
Abstract
Assisted Reproductive Technology (ART) is a set of medical procedures primarily used to address infertility. The success rate of ART is very low because it is affected by a large number of variables. Machine learning techniques are now applied to predict ART outcomes and to find strategies that improve the success rate, so determining the best-performing classifier for ART is very important. Previously, some classifiers have been applied to ART with static data. In reality, however, the datasets are dynamic in nature and require a dynamic setup, which can be achieved with the help of incremental classifiers. Because of the low success rate, the ART dataset contains fewer records with positive results, which makes the dataset imbalanced. This research work first finds the best evaluation metric for classification on an imbalanced dataset. It then balances the dataset using three different balancing techniques, namely undersampling, oversampling and Synthetic Minority Oversampling Technique (SMOTE), and applies five different incremental classifiers, namely Stochastic Gradient Descent (SGD), Stochastic Primal Estimated subGrAdient SOlver for Support vector machine (SPegasos), Naive Bayes Updatable, Instance Based (IBk) and Averaged One-Dependence Estimators (AODE) Updatable, to find the best balancing technique and the most suitable classifier for ART outcome prediction. The results show that, for an imbalanced dataset, Receiver Operating Characteristics (ROC) Area may be taken as the metric instead of accuracy. SMOTE is found to be the best method for balancing the ART dataset, and the IB1 classifier performs well on the balanced data, with a high prediction rate of 92.3 for ROC. Finally, various feature selection methods are applied to the top three best-performing classifiers, and a suitable feature selection method is identified for each classifier.
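The two ideas in the abstract that lend themselves to a small illustration are the choice of ROC Area over accuracy and SMOTE-style balancing. The following is a minimal sketch, not the authors' implementation: the toy class counts, the feature values and the function names (`accuracy`, `roc_auc`, `smote_like`) are all assumptions made for illustration, and `smote_like` is a simplified interpolation in the spirit of SMOTE rather than a full implementation.

```python
import random


def accuracy(labels, predictions):
    """Fraction of records whose predicted class equals the true class."""
    return sum(y == p for y, p in zip(labels, predictions)) / len(labels)


def roc_auc(labels, scores):
    """ROC Area as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points on segments between minority samples and near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (squared Euclidean distance)
        neighbours = sorted(
            (m for m in minority if m is not base),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical imbalanced outcome labels: 95 negative, 5 positive.
labels = [0] * 95 + [1] * 5

# A classifier that always predicts "negative" looks accurate but cannot rank at all.
acc = accuracy(labels, [0] * 100)
auc = roc_auc(labels, [0.0] * 100)
print(f"accuracy={acc:.2f}, ROC area={auc:.2f}")

# Hypothetical two-feature minority records, oversampled until both classes have 95 records.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5), (1.8, 2.1)]
synthetic = smote_like(minority, n_new=90)
balanced_minority = minority + synthetic
print(len(balanced_minority))
```

In this toy setting the always-negative predictor reaches an accuracy of 0.95 while its ROC Area stays at 0.5, which is the kind of gap that motivates using ROC Area on imbalanced data.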
Year
2021
DOI
10.1142/S0218126621300075
Venue
Journal of Circuits, Systems and Computers
Keywords
Assisted reproductive technology (ART), imbalanced dataset, ROC, incremental classifiers, data balancing, feature selection
DocType
Journal
Volume
30
Issue
06
ISSN
0218-1266
Citations
0
PageRank
0.34
References
0
Authors
3
Name            Order  Citations  PageRank
A. Suruliandi   1      7          5.50
K. Ranjini      2      0          1.01
S. P. Raja      3      12         6.67