Title
Fast and accurate identification of semi-tryptic peptides in shotgun proteomics.
Abstract
One of the major problems in shotgun proteomics is the low peptide coverage obtained when analyzing complex protein samples. Identifying more peptides, e.g. non-tryptic peptides, may increase the peptide coverage and improve the protein identification and/or quantification that are based on the peptide identification results. Searching for all potential non-tryptic peptides is, however, time consuming for shotgun proteomics data from complex samples, and poses a challenge for routine data analysis. We hypothesize that non-tryptic peptides are mainly created by the truncation of regular tryptic peptides before separation. We introduce the notion of the truncatability of a tryptic peptide, i.e. the probability that the peptide is identified in its truncated form, and build a predictor to estimate a peptide's truncatability from its sequence. We show that our predictions achieve useful accuracy, with the area under the ROC curve ranging from 76% to 87%, and can be used to filter the sequence database for identifying truncated peptides. After filtering, only a limited number of tryptic peptides with the highest truncatability are retained for non-tryptic peptide searching. By applying this method to the identification of semi-tryptic peptides, we show that a significant number of such peptides can be identified within a searching time comparable to that of tryptic peptide identification.
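The filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the minimum peptide length of 6, the toy protein sequence, and the use of the common "cleave after K or R unless followed by P" trypsin rule are all assumptions made for illustration, and the `score` argument is a stand-in for the paper's learned truncatability predictor, which is not reproduced here.

```python
from typing import Callable, List


def tryptic_digest(protein: str, min_len: int = 6) -> List[str]:
    """In-silico digest: cleave after K or R unless the next residue is P.

    Assumption: this is the commonly used trypsin rule, with no missed cleavages.
    """
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        last = i == len(protein) - 1
        if last or (aa in "KR" and protein[i + 1] != "P"):
            if i + 1 - start >= min_len:
                peptides.append(protein[start:i + 1])
            start = i + 1
    return peptides


def semi_tryptic_truncations(peptide: str, min_len: int = 6) -> List[str]:
    """Enumerate truncated forms that keep exactly one original terminus."""
    n = len(peptide)
    prefixes = [peptide[:k] for k in range(min_len, n)]      # residues removed at the C-terminus
    suffixes = [peptide[n - k:] for k in range(min_len, n)]  # residues removed at the N-terminus
    return prefixes + suffixes


def filter_by_truncatability(
    peptides: List[str], score: Callable[[str], float], keep: int
) -> List[str]:
    """Retain only the `keep` peptides with the highest (hypothetical) score."""
    return sorted(peptides, key=score, reverse=True)[:keep]


if __name__ == "__main__":
    toy_protein = "AAAAAAKPGGGGGRCCCCCCCCKDDR"  # made-up sequence, not real data
    tryptic = tryptic_digest(toy_protein)
    # Placeholder score (peptide length); a real predictor would be learned from sequence.
    retained = filter_by_truncatability(tryptic, score=len, keep=1)
    candidates = [t for p in retained for t in semi_tryptic_truncations(p)]
    print(tryptic, retained, len(candidates))
```

Each tryptic peptide of length n contributes 2 × (n − 6) truncated candidates under these assumptions, which is why searching truncations of every tryptic peptide enlarges the search space, and why restricting the enumeration to a retained subset keeps it closer to the size of a tryptic-only search.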
Year
2008
DOI
10.1093/bioinformatics/btm545
Venue
Bioinformatics
Keywords
potential non-tryptic peptides, peptide identification result, shotgun proteomics, truncated peptides, accurate identification, peptide coverage, non-tryptic peptide, non-tryptic peptides, low peptide coverage, tryptic peptides, regular tryptic peptides, semi-tryptic peptides, roc curve, data analysis
Field
Sequence database, Proteomics, Computer science, Peptide, Complex protein, Bottom-up proteomics, Peptide spectral library, Bioinformatics, Shotgun proteomics, Peptide mass fingerprinting
DocType
Journal
Volume
24
Issue
1
ISSN
1367-4811
Citations
1
PageRank
0.38
References
2
Authors
10
Name              Order  Citations  PageRank
Pedro Alves       1      44         9.37
Randy J Arnold    2      66         12.84
David E Clemmer   3      36         8.27
Yixue Li          4      789        60.24
James Reilly      5      457        43.42
Quanhu Sheng      6      22         5.61
Haixu Tang        7      683        96.67
Zhiyin Xun        8      31         7.53
Rong Zeng         9      1          0.38
Predrag Radivojac 10     646        58.89