Title
MPI-LIT: a literature-curated dataset of microbial binary protein-protein interactions
Abstract
Prokaryotic protein-protein interactions are underrepresented in currently available databases. Here, we describe a 'gold standard' dataset (MPI-LIT) focusing on microbial binary protein-protein interactions and associated experimental evidence that we have manually curated from 813 abstracts and full texts that were selected from an initial set of 36 852 abstracts. The MPI-LIT dataset comprises 1237 experimental descriptions that describe a non-redundant set of 746 interactions, of which 659 (88%) are not reported in public databases. To estimate the curation quality, we compared our dataset with a union of microbial interaction data from IntAct, DIP, BIND and MINT. Among common abstracts, we achieve a sensitivity of up to 66% for interactions and 75% for experimental methods. Compared with these other datasets, MPI-LIT has the lowest fraction of interaction experiments per abstract (0.9) and the highest coverage of strains (92) and scientific articles (813). We compared methods that evaluate functional interactions among proteins (such as genomic context or co-expression), which are implemented in the STRING database. Most of these methods discriminate well between functionally relevant protein interactions (MPI-LIT) and high-throughput data.
Year: 2008
DOI: 10.1093/bioinformatics/btn481
Venue: BIOINFORMATICS
Field: Data mining, Protein–protein interaction, Computer science, Microbial interaction, Bioinformatics
DocType: Journal
Volume: 24
Issue: 22
ISSN: 1367-4803
Citations: 3
PageRank: 0.39
References: 10
Authors: 14