Title
A framework for the automatic extraction of rules from online text
Abstract
The majority of knowledge on the Web is encoded in unstructured text and is not linked to formalized knowledge, such as ontologies and rules. A potential solution to this problem is to acquire this knowledge through natural language processing and text mining methods. Prior work has focused on automatically extracting RDF- or OWL-based ontologies from text; however, the type of knowledge acquired is generally restricted to simple term hierarchies. This paper presents a general-purpose framework for acquiring more complex relationships from text and then encoding this knowledge as rules. Our approach starts with existing domain knowledge in the form of OWL ontologies and Semantic Web Rule Language (SWRL) rules and applies natural language processing and text matching techniques to deduce classes and properties. It then captures deductive knowledge in the form of new rules. We have evaluated our framework by applying it to web-based text on car rental requirements. We show that our approach can automatically and accurately generate rules for requirements of car rental companies not in the knowledge base. Our framework thus rapidly acquires complex knowledge from free text sources. We are expanding it to handle richer domains, such as medical science.
Year: 2011
DOI: 10.1007/978-3-642-22546-8_21
Venue: RuleML Europe
Keywords: free text source, complex knowledge, automatic extraction, knowledge base, natural language processing, formalized knowledge, online text, domain knowledge, text mining method, web-based text, unstructured text, captures deductive knowledge
Field: Text graph, Data mining, Computer science, Artificial intelligence, Natural language processing, Knowledge base, Semantic Web Rule Language, Ontology (information science), Domain knowledge, Information retrieval, Knowledge-based systems, Knowledge extraction, Knowledge acquisition, Database
DocType: Conference
Volume: 6826
ISSN: 0302-9743
Citations: 5
PageRank: 0.63
References: 17
Authors: 3
Name                Order  Citations  PageRank
Saeed Hassanpour    1      31         5.76
Martin J. O'Connor  2      536        57.50
Amar K. Das         3      420        51.09