Title
Extraction Of Protein-Protein Interactions Using Natural Language Processing Based Pattern Matching
Abstract
A significant part of our knowledge consists of relationships between two terms. However, most of this information is documented as unstructured text in various forms, such as books, online articles, and webpages. Extracting this information and storing it in a structured database could help people use it more conveniently. In this study, we propose a novel approach to extracting relationship information based on Natural Language Processing (NLP) and a graph-theoretic algorithm. Our method, Grammatical Relationship Graph for Triplets (GRGT), extracts three layers of information: the pairs of terms that have a certain relationship, the exact type of the relationship, and the direction of the relationship. GRGT works on a grammatical graph obtained by parsing the sentence with NLP. Patterns are extracted from the graph as the shortest paths among the words of interest. We designed a decision tree to perform the pattern matching. GRGT was applied to extract protein-protein interactions (PPIs) from biomedical literature and obtained better precision than the best-performing method in the literature. Beyond extracting PPIs, our method could easily be extended to extract relationship information between other bioentities.
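The shortest-path idea in the abstract can be illustrated with a small sketch. This is not the authors' GRGT implementation. The dependency edges below are written by hand for a toy sentence rather than produced by a parser. The function names (`build_graph`, `shortest_path`, `extract_triplet`) and the trigger-word set are hypothetical. A simple trigger-word lookup stands in for the decision tree the abstract mentions, and the direction of the relationship is not modeled.

```python
from collections import deque

# Hand-written dependency edges (head, relation, dependent) for the toy sentence
# "RAD51 directly interacts with BRCA2". A real pipeline would obtain these from
# a dependency parser; this parse is an assumption made for illustration only.
EDGES = [
    ("interacts", "nsubj", "RAD51"),
    ("interacts", "advmod", "directly"),
    ("interacts", "obl", "BRCA2"),
    ("BRCA2", "case", "with"),
]


def build_graph(edges):
    """Treat the dependency parse as an undirected graph for path finding."""
    graph = {}
    for head, _rel, dep in edges:
        graph.setdefault(head, set()).add(dep)
        graph.setdefault(dep, set()).add(head)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the words from start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def extract_triplet(edges, term_a, term_b, trigger_words):
    """Return (term_a, trigger, term_b) if a trigger word lies on the path."""
    path = shortest_path(build_graph(edges), term_a, term_b)
    if path is None:
        return None
    # Only interior words of the path can act as the relationship word.
    for word in path[1:-1]:
        if word in trigger_words:
            return (term_a, word, term_b)
    return None


triplet = extract_triplet(EDGES, "RAD51", "BRCA2", {"interacts", "binds"})
print(triplet)
```

In this toy parse the modifier "directly" falls off the shortest path between the two protein names, which is the reason for matching patterns on the path instead of on the full sentence.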
Year: 2017
DOI: 10.1109/BIBM.2017.8217847
Venue: 2017 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM)
Keywords: Information extraction, relationship extraction, protein-protein interactions, Natural Language Processing, graph-theoretic algorithm
DocType: Conference
ISSN: 2156-1125
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name             Order   Citations   PageRank
Kaixian Yu       1       0           0.34
Tingting Zhao    2       5           1.75
Peixiang Zhao    3       814         36.08
Jinfeng Zhang    4       86          10.11