Title
Toward Semantic Search for the Biogeochemical Literature
Abstract
Literature search is a vital step of every research project. Semantic literature search is an approach to article retrieval and ranking using concepts rather than keywords, in an attempt to address the well-known deficiencies of keyword-based search, namely, (1) retrieval of an overwhelming number of results, (2) rankings that do not precisely reflect true relevance, and (3) the omission of relevant results because they do not contain the idiosyncratic keywords of the query. The difficulty of semantic search, however, is that it requires significant knowledge engineering, often in the form of conceptual ontologies tailored to a particular scientific domain. It also requires non-trivial tuning, in the form of domain-specific term and concept weights. Here we present preliminary, work-in-progress results in the development of a semantic search system for the biogeochemical scientific literature. We report the following initial steps: first, one of the co-authors, a biogeochemistry expert, wrote a sample search query and ranked the five most relevant articles that were returned for that query from a popular keyword-based search engine. We then hand-annotated the five articles and the query with the Environmental Ontology (ENVO), an existing ontology for the domain. Critically, this pilot annotation revealed a number of missing concepts that we will add in future work. We then showed that a straightforward ontology distance metric between concepts in the search query and the five articles was sufficient to produce the expected ranking. We discuss the implications of these results, and outline the next steps required to produce a full-fledged semantic search system for the biogeochemistry scientific literature.
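The abstract does not say which ontology distance metric was used, so the following is only an illustrative sketch of how a distance-based ranking of this kind might look. It assumes shortest-path length over an is-a hierarchy as the distance, and scores each article by averaging, over the query concepts, the distance to the nearest concept annotated on that article. The concept names, the toy hierarchy, the article annotations, and all function names are hypothetical and are not taken from ENVO or from the paper.

```python
from collections import deque

# Hypothetical toy is-a hierarchy (child -> list of parents).
# These labels are illustrative only and are NOT real ENVO terms.
PARENTS = {
    "environmental material": [],
    "soil": ["environmental material"],
    "sediment": ["environmental material"],
    "water": ["environmental material"],
    "peat soil": ["soil"],
    "marine sediment": ["sediment"],
    "fresh water": ["water"],
}


def build_graph(parents):
    """Turn the child -> parents map into an undirected adjacency map."""
    graph = {concept: set() for concept in parents}
    for child, parent_list in parents.items():
        for parent in parent_list:
            graph[child].add(parent)
            graph[parent].add(child)
    return graph


def path_distance(graph, start, goal):
    """Number of edges on the shortest path between two concepts (BFS)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return float("inf")  # the two concepts are not connected


def article_score(graph, query_concepts, article_concepts):
    """Mean, over query concepts, of the distance to the nearest article concept.

    Lower is better. An article with no annotations scores infinity.
    """
    total = 0.0
    for q in query_concepts:
        total += min(
            (path_distance(graph, q, c) for c in article_concepts),
            default=float("inf"),
        )
    return total / len(query_concepts)


def rank_articles(graph, query_concepts, articles):
    """Return article names ordered from most to least relevant."""
    return sorted(
        articles,
        key=lambda name: article_score(graph, query_concepts, articles[name]),
    )


GRAPH = build_graph(PARENTS)
QUERY = ["peat soil", "fresh water"]  # hypothetical query annotation
ARTICLES = {  # hypothetical article annotations
    "article_a": ["marine sediment"],
    "article_b": ["peat soil", "water"],
    "article_c": ["soil", "sediment"],
}
RANKING = rank_articles(GRAPH, QUERY, ARTICLES)
print(RANKING)
```

In this toy setup, the article annotated with concepts closest to the query concepts is ranked first. A real system would load the actual ontology and would likely need the domain-specific concept weights that the abstract mentions, which this sketch omits.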
Year
2017
DOI
10.1109/IRI.2017.49
Venue
2017 IEEE International Conference on Information Reuse and Integration (IRI)
Keywords
Natural Language Processing, Semantic Search, Ontologies
Field
Data mining, Phrase search, Computer science, Ranking (information retrieval), Artificial intelligence, Ontology (information science), Web search query, Information retrieval, Query expansion, Semantic search, Search analytics, Concept search, Machine learning
DocType
Conference
ISBN
978-1-5386-1563-8
Citations
1
PageRank
0.35
References
14
Authors
7