Title
Mixture of logistic models and an ensemble approach for protein-protein interaction extraction
Abstract
Automatic extraction of protein-protein interaction (PPI) information from scientific literature is important for building PPI databases, studying biological networks, and discovering new biological knowledge through automatic hypothesis generation. In this paper, we present a new method for PPI extraction based on a mixture of logistic models. The method automatically clusters interaction words (words that describe the interactions of protein pairs) into groups with similar grammatical properties, and a logistic model is fitted for each cluster of interaction words. Directionality is an essential piece of information for many protein interactions and is important for building directed biological networks. Most current PPI extraction methods do not extract the direction of interactions, in part because few corpora are annotated with directionality information. We introduce a new corpus, PICAD, for evaluating PPI extraction tools that includes directional annotation. The corpus is available at http://stat.fsu.edu/~jinfeng/resources/PICAD.txt. In addition, we propose an ensemble approach using logistic regression, Bayesian networks, and SVM for identifying PPIs. We show that an ensemble of classifiers allows us to capture different features in the text, and we report an F-measure of 75.7% on our new corpus.
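To make the ensemble idea and the reported metric concrete, here is a minimal sketch. The abstract does not say how the three classifiers are combined, so majority voting is an assumption made only for illustration. The function names and the toy labels are hypothetical and are not taken from the paper or from the PICAD corpus. The F-measure is the standard harmonic mean of precision and recall.

```python
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine binary labels from several classifiers by strict majority.

    Assumption: the paper's actual combination rule is not stated in the
    abstract; majority voting is used here purely as an illustration.
    """
    n_classifiers = len(predictions)
    return [1 if 2 * sum(votes) > n_classifiers else 0
            for votes in zip(*predictions)]


def f_measure(gold: Sequence[int], predicted: Sequence[int]) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = interacting pair, 0 = non-interacting pair.
gold = [1, 0, 1, 1, 0, 1]
logistic = [1, 0, 1, 0, 0, 1]   # stand-in for logistic regression output
bayes_net = [1, 1, 0, 1, 0, 1]  # stand-in for Bayesian network output
svm = [0, 0, 1, 1, 1, 1]        # stand-in for SVM output

ensemble = majority_vote([logistic, bayes_net, svm])
print("ensemble labels:", ensemble)
print("ensemble F-measure:", f_measure(gold, ensemble))
print("logistic-only F-measure:", f_measure(gold, logistic))
```

In this toy case each classifier errs on a different pair, so the vote corrects every individual mistake. Real data will rarely behave this cleanly.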
Year: 2011
DOI: 10.1145/2147805.2147853
Venue: BCB
Keywords: directional information, new corpus, directionality information, current ppi extraction method, ppi extraction, biological network, automatic extraction, ppi databases, logistic model, ensemble approach, protein-protein interaction extraction, ppi extraction tool, logistic regression, protein protein interaction, bioinformatics, ranking, bayesian network, connectivity, centrality
Field: Annotation, Protein–protein interaction, Ranking, Computer science, Biological network, Support vector machine, Centrality, Bayesian network, Artificial intelligence, Logistic regression, Machine learning
DocType: Conference
Citations: 1
PageRank: 0.35
References: 19
Authors: 3
Name            Order  Citations  PageRank
Lindsey Bell    1      1          0.35
Jinfeng Zhang   2      86         10.11
Xufeng Niu      3      1          0.35