Title
Missing Data Analysis Using Multiple Imputation in Relation to Parkinson's Disease
Abstract
Missing data is an omnipresent problem in neurological control diseases, such as Parkinson's Disease. Statistical analyses at the level of Parkinson's Disease may be inaccurate if no adequate method for handling missing data is applied. To determine a useful way to treat missing data on Parkinson's stage, we propose a multiple imputation method based on copula theory, applied in the data pre-processing phase of the data mining process. We use copulas to estimate the multivariate joint probability distribution without restricting the marginal distributions of the random variables that represent the dimensions of our datasets to specific types. To evaluate the proposed approach, we compared our algorithm with seven state-of-the-art imputation methods (mean, regression, min, max, K-nearest neighbors, Markov Chain Monte Carlo, and Expectation-Maximization) on six dataset cases containing 5%, 15%, 25%, 35%, 45% and 50% missing data, respectively. The accuracy of each imputation method was evaluated using the Root Mean Square Error (RMSE). Our results indicate that the proposed method significantly outperforms the existing algorithms.
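The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a Gaussian copula, only two variables (a fully observed x and a partially missing y), synthetic data, and empirical marginals. All function names (rmse, normal_scores, empirical_quantile, copula_impute, demo) are hypothetical. Each missing value receives several draws from the conditional copula distribution, which are averaged, and the result is scored with RMSE against a mean-imputation baseline.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the Gaussian copula

def rmse(true_vals, imputed_vals):
    """Root mean square error between true and imputed values."""
    n = len(true_vals)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true_vals, imputed_vals)) / n)

def normal_scores(values):
    """Map values to normal scores through their empirical CDF (ranks)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores

def empirical_quantile(sorted_vals, u):
    """Inverse empirical CDF with linear interpolation, clamped to the data range."""
    pos = u * (len(sorted_vals) + 1) - 1
    pos = min(max(pos, 0.0), len(sorted_vals) - 1.0)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def pearson(a, b):
    """Sample Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)

def copula_impute(x, y, n_draws=20, seed=0):
    """Fill None entries of y given fully observed x (assumed Gaussian copula)."""
    rng = random.Random(seed)
    zx = normal_scores(x)
    observed = [i for i, v in enumerate(y) if v is not None]
    y_obs = [y[i] for i in observed]
    # Copula correlation estimated on the complete rows only.
    rho = pearson([zx[i] for i in observed], normal_scores(y_obs))
    cond_sd = math.sqrt(max(1.0 - rho * rho, 0.0))
    y_sorted = sorted(y_obs)
    filled = list(y)
    for i, v in enumerate(y):
        if v is None:
            # Several draws from the conditional normal, mapped back to y's scale.
            draws = [
                empirical_quantile(
                    y_sorted, STD_NORMAL.cdf(rho * zx[i] + cond_sd * rng.gauss(0, 1))
                )
                for _ in range(n_draws)
            ]
            filled[i] = sum(draws) / n_draws
    return filled

def demo(missing_rate, n=400, seed=42):
    """Return (copula RMSE, mean-imputation RMSE) on synthetic skewed data."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y_true = [math.exp(0.5 * xi + 0.2 * rng.gauss(0, 1)) for xi in x]
    missing = sorted(rng.sample(range(n), int(missing_rate * n)))
    hidden = set(missing)
    y = [None if i in hidden else v for i, v in enumerate(y_true)]
    imputed = copula_impute(x, y)
    truth = [y_true[i] for i in missing]
    observed = [v for v in y if v is not None]
    mean_val = sum(observed) / len(observed)
    return (rmse(truth, [imputed[i] for i in missing]),
            rmse(truth, [mean_val] * len(missing)))

# Missing-data percentages taken from the abstract.
for rate in (0.05, 0.15, 0.25, 0.35, 0.45, 0.50):
    copula_err, mean_err = demo(rate)
    print(f"{rate:.0%} missing: copula RMSE={copula_err:.3f}, mean RMSE={mean_err:.3f}")
```

The rank transform is what frees the sketch from assuming a particular marginal distribution: only the dependence structure is modelled as Gaussian, while y keeps its own (here log-normal) shape through the empirical quantile function.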
Year: 2016
DOI: 10.1145/3010089.3010117
Venue: Proceedings of the International Conference on Big Data and Advanced Wireless Technologies
Field: Data mining, Joint probability distribution, Markov chain Monte Carlo, Regression, Multivariate statistics, Computer science, Data pre-processing, Missing data, Imputation (statistics), Statistics, Marginal distribution
DocType: Conference
ISBN: 978-1-4503-4779-2
Citations: 0
PageRank: 0.34
References: 4
Authors
5
Name | Order | Citations | PageRank
Rima Houari | 1 | 0 | 0.34
Ahcène Bounceur | 2 | 306 | 35.05
Tahar M. Kechadi | 3 | 27 | 3.24
Abdelkamel Tari | 4 | 70 | 24.81
Reinhardt Euler | 5 | 95 | 28.50