Title
A preprocessing of outlier using KERNEL PCA and factor scores in regression model
Abstract
Data containing outliers is more difficult to analyze than data without them. In supervised learning tasks such as classification and regression, outliers can increase the misclassification rate and the variance of estimates. In clustering, an unsupervised learning task, outliers can form clusters of their own, which makes the clustering result hard to interpret. Because of these problems, outliers are generally removed before a model is constructed in data mining. But when the outliers carry information about the given data, they must not be removed from the training data set. In this paper, we propose a preprocessing method based on kernel PCA (principal component analysis) and factor scores that keeps the outliers in the modeling. The effect of the outliers in the training data set is reduced by using the kernel PCA values and factor scores. We verify the improved performance of our method through experiments on simulated data sets with a regression model.
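The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an RBF kernel, a hand-picked `gamma`, two retained components, and a synthetic data set with three injected outliers; all of these choices and all function names are illustrative assumptions. The factor-score step is not shown because the abstract gives no detail on it.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Pairwise RBF kernel matrix exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_pca_scores(X, n_components, gamma):
    """Project the training points onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Center the kernel matrix in feature space.
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Score of training point i on component k is sqrt(lambda_k) * v_k[i].
    return vecs * np.sqrt(np.maximum(vals, 0.0))


def fit_linear(Z, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


# Synthetic data (an assumption for illustration): 60 points, 2 inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=60)
X[:3] += 25.0  # inject three outliers into the inputs, kept in the training set

# Replace the raw inputs with kernel PCA scores, then fit the regression.
scores = kernel_pca_scores(X, n_components=2, gamma=0.5)
coef = fit_linear(scores, y)

print("largest |raw input|:", np.abs(X).max())
print("largest |kernel PCA score|:", np.abs(scores).max())
```

With an RBF kernel every point has unit norm in feature space, so the scores stay bounded however extreme the raw outlier values are. That is one way to read "the outlier effect is reduced" while the outliers stay in the training data. Whether this improves the regression fit depends on the data and on `gamma`, and the sketch makes no claim about it.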
Year
2009
DOI
10.1109/FUZZY.2009.5277180
Venue
FUZZ-IEEE
Keywords
factor score, variance estimate, supervised learning, pattern clustering, regression analysis, pattern classification, simulation data set, regression model, data analysis, estimation theory, clustering method, outlier preprocessing method, training data set, data mining, kernel pca, principal component analysis, unsupervised learning, misclassification rate, kernel, data models, linear regression, training data
Field
Data mining, Data modeling, Regression analysis, Computer science, Kernel principal component analysis, Unsupervised learning, Artificial intelligence, Cluster analysis, Pattern recognition, Outlier, Supervised learning, Machine learning, Principal component analysis
DocType
Conference
ISSN
1098-7584
E-ISBN
978-1-4244-3597-5
ISBN
978-1-4244-3597-5
Citations
0
PageRank
0.34
References
3
Authors
3
Name            Order  Citations  PageRank
Kyung-Whan Oh   1      20         3.20
Sung-Hae Jun    2      95         11.79
Yong-Jun Kim    3      0          0.34