Title
Methods for labeling error detection in microarrays based on the effect of data perturbation on the regression model.
Abstract
Mislabeled samples often appear in gene expression profiles because different disease subtypes resemble one another and because diagnosis is subjective. Mislabeled samples degrade supervised learning procedures. The LOOE-sensitivity algorithm detects mislabeled samples in microarray data by means of data perturbation. However, it performs poorly because it fails to measure the effect of the perturbation. The purpose of this article is to design a novel method for detecting mislabeled microarray samples that takes advantage of measuring the effect of data perturbations. To measure this effect, we define an index named perturbing influence value (PIV), based on the support vector machine (SVM) regression model. The Column Algorithm (CAPIV), Row Algorithm (RAPIV) and progressive Row Algorithm (PRAPIV), all based on the PIV, are proposed to detect the mislabeled samples. Experimental results on six artificial datasets and five microarray datasets demonstrate that all methods proposed in this article are superior to LOOE-sensitivity. Moreover, compared with the simple SVM and CL-stability, the PRAPIV algorithm shows an increase in precision and high recall. The program and source code (in Java) are publicly available at http://ccst.jlu.edu.cn/CSBG/PIVS/index.htm
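The general idea of perturbation-based detection in the abstract can be illustrated with a toy sketch. This is not the PIV index or any of the CAPIV, RAPIV or PRAPIV algorithms, whose definitions the abstract does not give. It assumes a simpler stand-in: flip one label at a time, refit an ordinary least-squares line (in place of SVM regression), and score each sample by how much the flip reduces the squared fitting error. All function names and data values are hypothetical.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a*x + b for one feature."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def squared_error(xs, ys):
    """Sum of squared residuals of the fitted line."""
    a, b = fit_line(xs, ys)
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def flip_scores(xs, labels):
    """Score = baseline error minus error after flipping that one label.

    A positive score means the regression fits better with the label
    flipped, which marks the sample as a mislabeling suspect.
    """
    base = squared_error(xs, labels)
    scores = []
    for i in range(len(labels)):
        perturbed = list(labels)
        perturbed[i] = -perturbed[i]  # the data perturbation: one label flip
        scores.append(base - squared_error(xs, perturbed))
    return scores


# Made-up expression values for a single gene; classes are coded -1 / +1.
expression = [0.1, 0.2, 0.3, 0.4, 1.6, 1.7, 1.8, 1.9]
labels = [-1, -1, 1, -1, 1, 1, 1, 1]  # index 2 is deliberately mislabeled
scores = flip_scores(expression, labels)
suspect = max(range(len(scores)), key=scores.__getitem__)
```

In this toy data only the deliberately mislabeled sample gets a positive score. Real microarray data has thousands of features and far noisier class boundaries, which is where a more careful measure of the perturbing effect, such as the one the article proposes, becomes necessary.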
Year: 2009
DOI: 10.1093/bioinformatics/btp478
Venue: Bioinformatics
Keywords: looe-sensitivity algorithm,column algorithm,microarray datasets,error detection,prapiv algorithm,regression model,mislabeled sample,row algorithm,mislabeled sample detection,data perturbation,perturbing effect,measuring effect
Field: Data mining,Regression,Source code,Computer science,Regression analysis,Support vector machine,Error detection and correction,Supervised learning,DNA microarray,Perturbation (astronomy)
DocType: Journal
Volume: 25
Issue: 20
ISSN: 1367-4811
Citations: 12
PageRank: 0.76
References: 7
Authors: 7
Name              Order  Citations  PageRank
Chen Zhang        1      12         1.09
Wu Chunguo        2      78         12.15
Enrico Blanzieri  3      581        52.98
You Zhou          4      12         1.09
Yan Wang          5      143        18.74
Wei Du            6      12         0.76
Yanchun Liang     7      495        63.74