Title
Boosting multiclass learning with repeating codes and weak detectors for protein subcellular localization.
Abstract
Determining locations of protein expression is essential to understand protein function. Advances in green fluorescent protein (GFP) fusion proteins and automated fluorescence microscopy allow for rapid acquisition of large collections of protein localization images. Recognition of these cell images requires an automated image analysis system. Approaches taken by previous work concentrated on designing a set of optimal features and then applying standard machine-learning algorithms. In fact, trends of recent advances in machine learning and computer vision can be applied to improve the performance. One trend is the advances in multiclass learning with error-correcting output codes (ECOC). Another trend is the use of a large number of weak detectors with boosting for detecting objects in images of real-world scenes. We take advantage of these advances to propose a new learning algorithm, AdaBoost.ERC, coupled with weak and strong detectors, to improve the performance of automatic recognition of protein subcellular locations in cell images. We prepared two image data sets of CHO and Vero cells and downloaded a HeLa cell image data set in the public domain to evaluate our new method. We show that AdaBoost.ERC outperforms other AdaBoost extensions. We demonstrate the benefit of weak detectors by showing significant performance improvements over classifiers using only strong detectors. We also empirically test our method's capability of generalizing to heterogeneous image collections. Compared with previous work, our method performs reasonably well for the HeLa cell images. CHO and Vero cell images, their corresponding feature sets (SSLF and WSLF), our new learning algorithm, AdaBoost.ERC, and Supplementary Material are available at http://aiia.iis.sinica.edu.tw/
Year: 2007
DOI: 10.1093/bioinformatics/btm497
Venue: Bioinformatics
Keywords: previous work, vero cell, fusion protein, cell image, hela cell image data, hela cell image, weak detector, protein subcellular localization, vero cell image, new learning algorithm, strong detector, boosting multiclass, green fluorescent protein, machine learning, public domain, image recognition, protein expression, image analysis, protein localization, computer vision
Field: Data set, AdaBoost, Computer science, Generalization, Protein subcellular localization prediction, Protein expression, Protein function, Boosting (machine learning), Bioinformatics, Detector
DocType: Journal
Contact: chunnan@iis.sinica.edu.tw
Supplementary information: Supplementary data are available at Bioinformatics online.
Volume: 23
Issue: 24
ISSN: 1367-4811
Citations: 16
PageRank: 0.84
References: 19
Authors: 8
Name                 Order  Citations  PageRank
Chung-Chih Lin       1      204        23.07
Yuh-Show Tsai        2      103        8.81
Yu-Shi Lin           3      84         4.52
Tai-Yu Chiu          4      24         1.70
Chia-Cheng Hsiung    5      16         0.84
May-I Lee            6      16         0.84
Jeremy C. Simpson    7      28         3.76
C. N. Hsu            8      1233       157.54