Title
The Protein Information Resource: an integrated public resource of functional annotation of proteins.
Abstract
The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases).
Year: 2002
DOI: 10.1093/nar/30.1.35
Venue: NUCLEIC ACIDS RESEARCH
Keywords: biological database, systems integration, sequence analysis, protein sequence, data mining, information retrieval, public domain, internet, proteins, amino acid sequence
Field: RefSeq, Protein structure database, Annotation, Sequence database, Information retrieval, Biology, Munich Information Center for Protein Sequences, Biological database, Database schema, Protein Annotation, Bioinformatics, Genetics
DocType: Journal
Volume: 30
Issue: 1
ISSN: 0305-1048
Citations: 54
PageRank: 8.51
References: 16
Authors: 16
Name | Order | Citations | PageRank
Cathy H. Wu | 1 | 4169 | 508.88
Hongzhan Huang | 2 | 2479 | 346.17
Leslie Arminski | 3 | 209 | 54.05
Jorge Castro-Alvear | 4 | 200 | 42.21
Yongxing Chen | 5 | 228 | 45.77
Zhang-Zhi Hu | 6 | 408 | 65.37
Robert S. Ledley | 7 | 391 | 208.41
Kali C. Lewis | 8 | 54 | 8.51
Hans-Werner Mewes | 9 | 1992 | 426.67
Bruce C. Orcutt | 10 | 205 | 83.39
Baris E Suzek | 11 | 656 | 111.13
Akira Tsugita | 12 | 218 | 92.14
C. R. Vinayaka | 13 | 217 | 53.65
Lai-Su L. Yeh | 14 | 1839 | 286.04
Jian Zhang | 15 | 352 | 51.22
Winona C. Barker | 16 | 2034 | 333.71