Title
Protein-Protein Interaction Extraction from Text by Selecting Linguistic Features
Abstract
Extracting protein-protein interactions (PPIs) from articles is important for understanding the underlying biological processes. With advances in natural language processing, many automatic methods for extracting PPIs from articles have been developed, notably machine learning-based methods, which include feature-based and kernel-based approaches. However, the results of these methods still leave considerable room for improvement. We propose a novel method for extracting PPIs from articles. We use many diverse features, including lexical features obtained from sentences and features obtained from parse trees. We also devise new features extracted from the shortest dependency paths in dependency trees. In our method, the training and test data are first partitioned into subsets based on the basic structures of the sentences, and feature selection (FS) is performed. We then decrease the values of the features in each group of similar features, for every instance, by multiplying them by the corresponding shrink coefficients. These shrink coefficients are determined automatically. Experimental results on five corpora show the usefulness of the proposed method.
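The two mechanisms named in the abstract, shortest dependency paths and group-wise shrinking of feature values, can be illustrated with a minimal sketch. This is not the authors' implementation. The toy parse, the feature names, the group names, and the coefficient values are all hypothetical, and the coefficients are hard-coded here, whereas the abstract states that they are determined automatically.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Breadth-first search over a dependency tree treated as an undirected graph."""
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, []).append(dependent)
        graph.setdefault(dependent, []).append(head)
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None  # the two tokens are not connected


def shrink_features(instance, groups, coefficients):
    """Multiply every feature value by the shrink coefficient of its group."""
    shrunk = dict(instance)
    for group_name, feature_names in groups.items():
        factor = coefficients[group_name]
        for name in feature_names:
            if name in shrunk:
                shrunk[name] *= factor
    return shrunk


# Hypothetical parse of "PROT1 interacts with PROT2" as (head, dependent) pairs.
edges = [("interacts", "PROT1"), ("interacts", "with"), ("with", "PROT2")]
path = shortest_dependency_path(edges, "PROT1", "PROT2")

# Hypothetical instance, feature groups, and hand-picked coefficients.
instance = {"word=interacts": 1.0, "sdp_len": 4.0, "sdp=interacts": 1.0}
groups = {"lexical": ["word=interacts"], "sdp": ["sdp_len", "sdp=interacts"]}
coefficients = {"lexical": 0.5, "sdp": 0.25}
shrunk = shrink_features(instance, groups, coefficients)
```

Here `path` holds the tokens linking the two protein mentions, from which path-based features could be derived, and `shrunk` is a copy of the instance in which each group of similar features has been scaled down by its own coefficient.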
Year
2017
DOI
10.1109/BIBE.2017.00-58
Venue
2017 IEEE 17th International Conference on Bioinformatics and Bioengineering (BIBE)
Keywords
protein protein interaction,information extraction,biomedical text mining,machine learning
Field
Kernel (linear algebra),Training set,Feature selection,Computer science,Feature extraction,Test data,Artificial intelligence,Natural language processing,Parsing,Machine learning
DocType
Conference
ISSN
2471-7819
ISBN
978-1-5386-1325-2
Citations
0
PageRank
0.34
References
0
Authors
3
Name | Order | Citations | PageRank
Thi Thanh Thuy Phan | 1 | 0 | 0.34
Takenao Ohkawa | 2 | 77 | 15.46
Akihiro Yamamoto | 3 | 135 | 26.84