Title
Selecting actions for resource-bounded information extraction using reinforcement learning
Abstract
Given a database with missing or uncertain content, our goal is to correct and fill the database by extracting specific information from a large corpus such as the Web, and to do so under resource limitations. We formulate the information gathering task as a series of choices among alternative, resource-consuming actions and use reinforcement learning to select the best action at each time step. We use the temporal difference Q-learning method to train the function that selects these actions, and compare it to an online, error-driven algorithm called SampleRank. We present a system that finds information such as email, job title and department affiliation for the faculty at our university, and show that the learning-based approach accomplishes this task efficiently under a limited action budget. Our evaluations show that we can obtain 92.4% of the final F1 using only 14.3% of all possible actions.
Year
2012
DOI
10.1145/2124295.2124328
Venue
WSDM
Keywords
final f1,resource-consuming action,limited action budget,department affiliation,specific information,best action,error-driven algorithm,possible action,use reinforcement,reinforcement learning,information gathering task,resource-bounded information extraction,selecting action,temporal difference,information extraction,web mining
Field
Data mining,Temporal difference learning,Web mining,Information retrieval,Computer science,Information extraction,Specific-information,Artificial intelligence,Machine learning,Reinforcement learning,Bounded function
DocType
Conference
Citations
12
PageRank
0.66
References
10
Authors
2
Name                      Order  Citations  PageRank
Pallika H. Kanani         1      41         1.47
Andrew Kachites McCallum  2      19203      1588.22