Title
Gradient boosting model for unbalanced quantitative mass spectra quality assessment
Abstract
This paper describes a method for quality control of isotope-labeled mass spectra. In such spectra, the profiles of labeled (heavy) and unlabeled (light) peptide pairs provide valuable information about the biological samples under different conditions. The core task of quality control in a quantitative LC-MS experiment is to filter out low-quality spectra or peptides with erroneous profiles. The most commonly used approach is to train a classifier that separates the spectra into positive (high-quality) and negative (low-quality) classes. However, because erroneous profiles are rare, the training data are dominated by positive samples, which is the class imbalance problem. We therefore employ the synthetic minority over-sampling technique (SMOTE) to handle the unbalanced data and then apply an extreme gradient boosting (XGBoost) model as the classifier. We assessed samples with different heavy-light peptide ratios using the trained XGBoost classifier and found that the SMOTE XGBoost classifier significantly increases the reliability of peptide ratio estimations.
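The over-sampling step named in the abstract can be illustrated with a minimal sketch of basic SMOTE: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, its parameters, and the two-dimensional toy features are assumptions made for illustration, and the classifier training step is omitted.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Basic SMOTE sketch: interpolate between minority samples and
    their k nearest minority-class neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # New point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (assumed, not from the paper): many high-quality spectra
# (positive class) and only a few low-quality ones (negative class).
data_rng = random.Random(1)
high_quality = [[data_rng.gauss(1.0, 0.2), data_rng.gauss(1.0, 0.2)] for _ in range(50)]
low_quality = [[data_rng.gauss(0.0, 0.2), data_rng.gauss(0.0, 0.2)] for _ in range(5)]

# Over-sample the minority class until both classes have the same size.
extra = smote(low_quality, n_synthetic=len(high_quality) - len(low_quality), k=3)
balanced_low = low_quality + extra
print(len(high_quality), len(balanced_low))
```

In a pipeline like the one the abstract outlines, the balanced set would then be used to train the gradient boosting classifier. Over-sampling is normally applied to the training split only, so that evaluation data keep their original class ratio.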
Year
2017
DOI
10.1109/SPAC.2017.8304311
Venue
2017 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC)
Keywords
spectra data classifier,low quality spectral filtering,labeled peptide pairs,peptide ratio estimations,SMOTE Xgboost classifier,trained Xgboost classifier,heavy-light peptide ratio samples,Synthetic minority over-sampling technique,class imbalance problem,error profiles,quantitative LC-MS experiment,quality control,unlabeled peptide pairs,isotope labeled mass spectra,unbalanced quantitative mass spectra quality assessment,gradient boosting model
Field
Small number,Training set,Pattern recognition,Mass spectrum,Signal-to-noise ratio,Feature extraction,Boosting (machine learning),Artificial intelligence,Classifier (linguistics),Mathematics,Gradient boosting
DocType
Conference
ISBN
978-1-5386-3017-4
Citations
0
PageRank
0.34
References
0
Authors
3
Name, Order, Citations, PageRank
Long Chen, 1, 528, 49.21
Tong Zhang, 2, 53, 18.56
Tianjun Li, 3, 0, 1.01