Title
Web information extraction using Markov logic networks
Abstract
In this paper, we consider the problem of extracting structured data from web pages, taking into account both the content of individual attributes and the structure of pages and sites. We use Markov Logic Networks (MLNs) to capture content and structural features in a single unified framework, which enables more accurate inference. We show that inference in our information extraction scenario reduces to solving an instance of the maximum weight subgraph problem. We develop specialized procedures for solving the maximum subgraph variants that are far more efficient than previously proposed inference methods for MLNs, which solve variants of MAX-SAT. Experiments with real-life datasets demonstrate the effectiveness of our approach.
Year: 2011
DOI: 10.1145/1963192.1963251
Venue: International World Wide Web Conferences
Keywords: structural feature, mln-based approach, markov logic network, information extraction scenario, state-of-the-art extraction method, maximum subgraph variant, specialized procedure, rich structural feature, single unified framework, markov logic networks, accurate inference, maximum weight subgraph problem, inference method, individual attribute, web information extraction, real-life datasets, information extraction
DocType: Conference
Citations: 8
PageRank: 0.48
References: 16
Authors: 5
Name | Order | Citations | PageRank
Sandeepkumar Satpal | 1 | 58 | 3.05
Sahely Bhadra | 2 | 44 | 4.64
Sundararajan Sellamanickam | 3 | 127 | 14.07
Rajeev Rastogi | 4 | 6151 | 827.22
Prithviraj Sen | 5 | 837 | 38.24