Title
Avatar Information Extraction System
Abstract
The AVATAR Information Extraction System (IES) at the IBM Almaden Research Center enables high-precision, rule-based information extraction from text documents. Drawing from our experience, we propose the use of probabilistic database techniques as the formal underpinnings of information extraction systems so as to maintain high precision while increasing recall. This involves building a framework where rule-based annotators can be mapped to queries in a database system. We use examples from AVATAR IES to describe the challenges in achieving this goal. Finally, we show that deriving precision estimates in such a database system presents a significant challenge for probabilistic database systems.
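To make the idea of mapping a rule-based annotator to a database query concrete, here is a minimal sketch. It is not AVATAR's actual rule language, schema, or API: the `span` table, the `PersonPhone` rule, the sample rows, and the confidence values are all hypothetical. The rule says that a Person span followed within a few characters by a Phone span yields a person-phone annotation, and it is written as a SQL join. The combined confidence is computed by multiplying the two input confidences, which assumes the base annotations are independent. That is a simplifying assumption made only for illustration; the abstract itself notes that deriving precision estimates is a significant challenge.

```python
import sqlite3


def build_db():
    """Create an in-memory table of base annotations (all rows are made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE span (doc_id INTEGER, kind TEXT, "
        "start_pos INTEGER, end_pos INTEGER, text TEXT, prob REAL)"
    )
    rows = [
        (1, "Person", 0, 10, "Jane Smith", 0.9),
        (1, "Phone", 25, 33, "555-0100", 0.8),
        (1, "Person", 60, 69, "Bob Jones", 0.7),
        (1, "Phone", 200, 208, "555-0199", 0.95),
    ]
    conn.executemany("INSERT INTO span VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


# Hypothetical rule: a Person followed by a Phone within max_gap characters.
# The product p.prob * q.prob assumes independent base annotations.
PERSON_PHONE_RULE = """
SELECT p.text, q.text, p.prob * q.prob AS prob
FROM span AS p JOIN span AS q ON p.doc_id = q.doc_id
WHERE p.kind = 'Person' AND q.kind = 'Phone'
  AND q.start_pos >= p.end_pos
  AND q.start_pos - p.end_pos <= :max_gap
ORDER BY p.start_pos
"""


def person_phone(conn, max_gap=30):
    """Run the rule as a query and return (person, phone, prob) tuples."""
    return conn.execute(PERSON_PHONE_RULE, {"max_gap": max_gap}).fetchall()
```

Calling `person_phone(build_db())` on the sample rows pairs "Jane Smith" with "555-0100" only, since the other Person and Phone spans are farther apart than `max_gap`.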
Year: 2006
Venue: IEEE Data Eng. Bull.
Keywords: rule based, database system, information extraction, probabilistic database
DocType: Journal
Volume: 29
Issue: 1
Citations: 49
PageRank: 2.32
References: 6
Authors: 5
Name
Order
Citations
PageRank
T. S. Jayram 1137375.87
Rajasekar Krishnamurthy 2121471.86
Sriram Raghavan 3109697.25
Shivakumar Vaithyanathan 42518234.40
Huaiyu Zhu 51497.05