Title
Prioritizing Literature Search Results Using a Training Set of Classified Documents.
Abstract
Finding relevant articles is becoming an increasingly demanding task for researchers in the biomedical field, due to the rapid expansion of the scientific literature. We investigate the use of ranking strategies for prioritizing literature search results given an initial topic of interest. Focusing on the topic of protein-protein interactions, we compare ranking strategies based on different classifiers and features. The best result obtained on the BioCreative III PPI test set was an area under the interpolated precision-recall curve of 0.629. We then analyze the use of this method for ranking the results of PubMed queries. The results indicate that this strategy can be used by database curators to prioritize articles for extraction of protein-protein interactions, and also by general researchers looking for publications describing protein-protein interactions within a particular area of interest.
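The evaluation measure named in the abstract, the area under the interpolated precision-recall curve, can be illustrated with a minimal Python sketch. It assumes one common definition (interpolated precision at a recall level is the highest precision reached at that recall or any higher one, and the area is summed step-wise over recall); the official BioCreative III evaluation script may differ in detail. The function names `rank_by_score` and `interpolated_pr_auc` and the sample scores and labels are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def rank_by_score(labels: Sequence[int], scores: Sequence[float]) -> List[int]:
    """Return the relevance labels reordered by descending classifier score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def interpolated_pr_auc(ranked_labels: Sequence[int]) -> float:
    """Area under the interpolated precision-recall curve of a ranked list.

    ranked_labels holds 1 for a relevant document and 0 otherwise,
    ordered from the highest-ranked document to the lowest.
    """
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        return 0.0

    # Record (recall, precision) each time a relevant document is retrieved.
    points = []
    hits = 0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            points.append((hits / total_relevant, hits / rank))

    # Interpolate: precision at recall r is the maximum precision
    # observed at any recall greater than or equal to r.
    best = 0.0
    interpolated = []
    for recall, precision in reversed(points):
        best = max(best, precision)
        interpolated.append((recall, best))
    interpolated.reverse()

    # Sum the area of the resulting step function over recall.
    area = 0.0
    previous_recall = 0.0
    for recall, precision in interpolated:
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


if __name__ == "__main__":
    # Hypothetical classifier scores and gold-standard labels for five articles.
    scores = [0.91, 0.15, 0.78, 0.40, 0.62]
    labels = [1, 0, 0, 0, 1]
    ranked = rank_by_score(labels, scores)
    print(f"AUC iP/R: {interpolated_pr_auc(ranked):.3f}")
```

A ranking that places every relevant article ahead of every irrelevant one scores 1.0 under this definition, so a value such as the reported 0.629 reflects how far the ranked list is from that ideal ordering.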
Year: 2011
DOI: 10.1007/978-3-642-19914-1_49
Venue: 5TH INTERNATIONAL CONFERENCE ON PRACTICAL APPLICATIONS OF COMPUTATIONAL BIOLOGY & BIOINFORMATICS (PACBB 2011)
Keywords: Information Retrieval, Biomedical Literature, Protein-protein Interactions, Article Classification
Field: Training set, Scientific literature, Ranking, Information retrieval, Computer science, Area of interest, Test set
DocType: Conference
Volume: 93
ISSN: 1867-5662
Citations: 0
PageRank: 0.34
References: 8
Authors: 2
Name, Order, Citations, PageRank
Sérgio Matos, 1, 415, 29.51
José Luis Oliveira, 2, 760, 84.03