Title
On Extracting Data from HTML Tables.
Abstract
The Web provides a great deal of data in user-friendly tabular formats that are encoded using HTML. Information extractors are intended to extract those data as datasets that can feed business applications. There are many proposals to implement them, which has motivated several previous surveys. Unfortunately, those surveys are outdated, and we do not think that it suffices to update them, because they do not provide a good conceptual framework, a taxonomy of web tables, an analysis of the exact tasks involved, or a good comparison framework. This article presents a review of the literature that addresses each of these shortcomings, which we hope will be useful to both researchers and practitioners.
Year: 2019
Venue: arXiv: Information Retrieval
DocType: Journal
Volume: abs/1903.08305
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name                 Order   Citations   PageRank
Juan C. Roldán       1       0           0.34
Patricia Jiménez     2       14          3.99
Rafael Corchuelo     3       389         49.87