Abstract |
---|
The AVATAR Information Extraction System (IES) at the IBM Almaden Research Center enables high-precision, rule-based information extraction from text documents. Drawing from our experience, we propose the use of probabilistic database techniques as the formal underpinnings of information extraction systems so as to maintain high precision while increasing recall. This involves building a framework where rule-based annotators can be mapped to queries in a database system. We use examples from AVATAR IES to describe the challenges in achieving this goal. Finally, we show that deriving precision estimates in such a database system presents a significant challenge for probabilistic database systems. |
Year | Venue | Keywords |
---|---|---|
2006 | IEEE Data Eng. Bull. | rule based,database system,information extraction,probabilistic database |
DocType | Volume | Issue |
---|---|---|
Journal | 29 | 1 |
Citations | PageRank | References |
---|---|---|
49 | 2.32 | 6 |
Authors |
---|
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
T. S. Jayram | 1 | 1373 | 75.87 |
Rajasekar Krishnamurthy | 2 | 1214 | 71.86 |
Sriram Raghavan | 3 | 1096 | 97.25 |
Shivakumar Vaithyanathan | 4 | 2518 | 234.40 |
Huaiyu Zhu | 5 | 149 | 7.05 |