Title
Pictor: an interactive system for importing data from a website
Abstract
We present a demonstration of Pictor, an interactive wrapper induction system that minimizes labeling cost while still extracting data from a website with high accuracy. The demonstration introduces two proposed technologies: record-level wrappers and a wrapper-assisted labeling strategy. These approaches allow Pictor to exploit previously generated wrappers to predict similar labels in a partially labeled webpage or in a completely new webpage. Our experimental results show the effectiveness of the Pictor system.
Year
2008
DOI
10.1145/1401890.1402028
Venue
KDD
Keywords
proposed technology, new webpage, record-level wrapper, experiment result, interactive wrapper induction system, pictor system, interactive system, similar label, high accuracy, information extraction
Field
Data mining, World Wide Web, Web page, Information retrieval, Computer science, Exploit, Information extraction
DocType
Conference
Citations
1
PageRank
0.37
References
9
Authors
4
Name               Order  Citations  PageRank
Shuyi Zheng        1      256        11.22
Matthew R. Scott   2      93         10.84
Ruihua Song        3      1138       59.33
Ji-Rong Wen        4      4431       265.98