Title
Engineering an Aligned Gold-Standard Corpus of Human to Machine Oriented Controlled Natural Language
Abstract
Knowledge base creation and population form an essential formal backbone for a variety of intelligent applications, including decision support, expert systems, and intelligent search. While the abundance of unstructured text helps to narrow the knowledge acquisition gap, the ambiguous nature of language tends to reduce accuracy in more complex semantic analysis. Controlled Natural Languages (CNLs) are subsets of natural language that are grammatically restricted in order to reduce or eliminate ambiguity, either for machine processability or for unambiguous human communication within a domain or industry context, as with Simplified English. This type of human-oriented CNL is under-researched despite having found favor within industry over many years. We describe a novel dataset which aligns a representative sample of Simplified English Wikipedia sentences with a well-known machine-oriented CNL. This linguistic resource is both human-readable and semantically machine-interpretable, and can benefit a variety of NLP and knowledge-based applications.
Year
2018
DOI
10.1109/WI.2018.00-58
Venue
2018 IEEE/WIC/ACM International Conference on Web Intelligence (WI)
Keywords
Natural Language Processing, Controlled Natural Language, Knowledge Extraction, Semantic Web
Field
Population, Controlled natural language, Information retrieval, Computer science, Expert system, Natural language, Knowledge extraction, Knowledge base, Semantics, Knowledge acquisition
DocType
Conference
ISBN
978-1-5386-7326-3
Citations
0
PageRank
0.34
References
6
Authors
3
Name            Order   Citations   PageRank
Hazem Safwat    1       6           2.64
Brian Davis     2       169         22.95
Manel Zarrouk   3       20          9.31