Title
Ontology-Based Supervised Concept Learning for the Biogeochemical Literature
Abstract
Academic literature search is a vital step of every research project, especially given the increasingly rapid growth of scientific knowledge. Semantic academic literature search retrieves and ranks scientific articles using concepts, in an attempt to address well-known deficiencies of keyword-based search. The difficulty of semantic search, however, is that it requires significant knowledge engineering, often in the form of conceptual ontologies tailored to a particular scientific domain. It also requires non-trivial tuning, in the form of domain-specific term and concept weights. As part of an ongoing project to build a domain-specific semantic search system, we present an ontology-based supervised concept learning approach for the biogeochemical scientific literature. We first discuss the creation of a dataset of scientific articles in the biogeochemical domain annotated using the Environment Ontology (ENVO). Next, we present a supervised machine learning classifier, a random decision forest, that uses a distinctive set of features to learn ENVO concepts and then label and index scientific articles at the sentence level. Finally, we evaluate our approach against two baseline methods, keyword-based and bag-of-words, achieving an overall F_1 measure of 0.76, an improvement of approximately 50%.
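As an illustration of the evaluation setup described in the abstract, the following is a minimal sketch of a keyword-based baseline that labels sentences with concepts and scores the result with the F_1 measure. The lexicon, the example sentences, the gold labels, and the function names `keyword_label` and `f1_score` are all assumptions made for illustration; they are not taken from the paper's dataset or code, and the micro-averaging used here is one possible way to compute an overall F_1.

```python
import re

# Hypothetical mini-lexicon: concept label -> trigger keywords.
# Illustrative only; not drawn from the paper's annotated dataset.
LEXICON = {
    "soil": {"soil", "soils"},
    "river": {"river", "stream"},
}


def keyword_label(sentence, lexicon=LEXICON):
    """Return the set of concept labels whose keywords occur in the sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return {concept for concept, words in lexicon.items() if tokens & words}


def f1_score(gold, predicted):
    """Micro-averaged F_1 over per-sentence label sets."""
    tp = sum(len(g & p) for g, p in zip(gold, predicted))  # correct labels
    fp = sum(len(p - g) for g, p in zip(gold, predicted))  # spurious labels
    fn = sum(len(g - p) for g, p in zip(gold, predicted))  # missed labels
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up sentences with made-up gold annotations.
sentences = [
    "Nitrogen leaches from agricultural soils after rain.",
    "The stream carries dissolved organic carbon downstream.",
    "Sediment cores were dated with lead isotopes.",
]
gold = [{"soil"}, {"river"}, {"lake"}]

predicted = [keyword_label(s) for s in sentences]
score = f1_score(gold, predicted)
print(predicted, round(score, 2))
```

In this toy example the third sentence names no keyword from the lexicon, so the baseline misses its concept entirely. This is the kind of gap a learned sentence-level classifier is meant to narrow.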
Year
2018
DOI
10.1109/IRI.2018.00066
Venue
2018 IEEE International Conference on Information Reuse and Integration (IRI)
Keywords
Natural Language processing,Semantic Search,Academic Search,Ontologies,Machine Learning
Field
Ontology (information science),Ontology,Scientific literature,Ranking,Semantic search,Information retrieval,Sociology of scientific knowledge,Computer science,Concept learning,Artificial intelligence,Knowledge engineering,Machine learning
DocType
Conference
ISBN
978-1-5386-2660-3
Citations
0
PageRank
0.34
References
17
Authors
7