Title
Improving Classifier Performance by Autonomously Collecting Background Knowledge from the Web
Abstract
Many websites allow users to tag data items to make them easier to find. In this paper we consider the problem of classifying tagged data according to user-specified interests. We present an approach for aggregating background knowledge from the Web to improve the performance of a classifier. In previous work, researchers have developed technology for extracting knowledge, in the form of relational tables, from semi-structured websites. Here we integrate this extraction technology with generic machine learning algorithms, showing that knowledge extracted from the Web can significantly benefit the learning process. Specifically, the knowledge can lead to better generalizations, reduce the number of samples required for supervised learning, and eliminate the need to retrain the system when the environment changes. We validate the approach with an application that classifies tagged Flickr data.
Year: 2011
DOI: 10.1109/ICMLA.2011.76
Venue: ICMLA (1)
Keywords: environment change, better generalization, generic machine, data item, aggregating background knowledge, extraction technology, improving classifier performance, relational table, flickr data, autonomously collecting background knowledge, previous work, supervised learning, data mining, learning artificial intelligence, machine learning, knowledge extraction, information retrieval, ontologies, knowledge engineering, information extraction
Field: Ontology (information science), World Wide Web, Computer science, Generalization, Supervised learning, Information extraction, Knowledge engineering, Artificial intelligence, Classifier (linguistics), Machine learning
DocType: Conference
Citations: 0
PageRank: 0.34
References: 11
Authors
6
Name
Order
Citations
PageRank
Steven Minton13473536.74
Matthew Michelson240922.23
Kane See341.61
Sofus A. Macskassy461347.11
Bora C. Gazen500.34
Lise Getoor64365320.21