Title
Categorizing relational facts from the web with fuzzy rough sets
Abstract
Significant advances have been made in automatically constructing knowledge bases of relational facts derived from web corpora. These relational facts are linguistic in nature and are represented as ordered pairs of nouns, such as (Winnipeg, Canada), belonging to a category, such as City_Country. One major problem is that these facts are abundant but mostly unlabeled. Hence, semi-supervised learning approaches have been successful in building knowledge bases: a small number of labeled examples serve as seed (training) instances, and a large number of unlabeled instances are learned iteratively. In this paper, we propose a novel fuzzy rough set-based semi-supervised learning algorithm (FRL) for categorizing relational facts derived from a given corpus. The proposed FRL algorithm is compared with a tolerance rough set-based learner (TPL) and the coupled pattern learner (CPL). The same ontology, derived from a subset of the corpus of the Never-Ending Language Learner system, was used in all of the experiments. The experiments demonstrate that the proposed FRL outperforms both TPL and CPL in terms of precision. The paper also addresses the concept drift problem by using mutual exclusion constraints. The contributions of this paper are: (i) the introduction of a formal fuzzy rough model for relations, (ii) a semi-supervised learning algorithm, (iii) an experimental comparison with two other machine learning algorithms, TPL and CPL, and (iv) a novel application of fuzzy rough sets.
Year: 2019
DOI: 10.1007/s10115-018-1250-6
Venue: Knowledge and Information Systems
Keywords: Text categorization, Relational facts, Semi-supervised learning, Fuzzy rough sets, Web mining
Field: Ontology, Semi-supervised learning, Web mining, Computer science, Fuzzy logic, Ordered pair, Coupled pattern learner, Concept drift, Rough set, Natural language processing, Artificial intelligence, Machine learning
DocType: Journal
Volume: 61
Issue: 3
ISSN: 0219-3116
Citations: 0
PageRank: 0.34
References: 21
Authors: 2
Name                Order  Citations  PageRank
Aditya Bharadwaj    1      0          0.34
S. Ramanna          2      92         18.42