Extracting Knowledge from Web Tables Based on DOM Tree Similarity. - Citegraph

Paper Info

Title
Extracting Knowledge from Web Tables Based on DOM Tree Similarity.

Abstract
Extracting structured (semi-structured) knowledge from Web tables is an important way to obtain high-quality knowledge. Unlike most extraction methods, which rely on external knowledge bases to understand tables, our method uses the inherent similarities among tables to determine their semantic structure. Based on a comprehensive analysis of table structures of various forms, we provide a novel way to calculate the DOM tree similarity between Web tables using dynamic time warping (DTW) and to cluster the tables accordingly. Experiments on a corpus of 5,000 randomly extracted Wikipedia tables show that the clustering results are close to those of classification based on empirical approaches and that, without the use of external knowledge bases, the quality of the knowledge extracted from the tables is satisfactory.
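
As a rough illustration of the DTW idea mentioned in the abstract, the sketch below compares two HTML tables by aligning their row structures with dynamic time warping. It is not the paper's algorithm: the row encoding (one letter per cell, 'h' for th and 'd' for td), the 0/1 local cost, and the names RowSignatures, row_signatures, and dtw_distance are all assumptions made for this example, and the clustering step is not shown.

```python
from html.parser import HTMLParser


class RowSignatures(HTMLParser):
    """Collect one signature string per <tr>: 'h' for <th>, 'd' for <td>.

    Hypothetical encoding for illustration; assumes no nested tables.
    """

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append("")          # start a new row signature
        elif tag in ("th", "td") and self.rows:
            self.rows[-1] += tag[1]       # 'h' or 'd'


def row_signatures(html):
    """Return the list of row signatures for an HTML table fragment."""
    parser = RowSignatures()
    parser.feed(html)
    return parser.rows


def dtw_distance(a, b, cost=lambda x, y: 0.0 if x == y else 1.0):
    """Classic DTW distance between sequences a and b.

    The default local cost (0 if two row signatures match, 1 otherwise)
    is an assumption, not a cost function taken from the paper.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(a[i - 1], b[j - 1])
            d[i][j] = step + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Two tables with a header row and a different number of data rows.
horizontal_a = ("<table><tr><th>A</th><th>B</th></tr>"
                "<tr><td>1</td><td>2</td></tr>"
                "<tr><td>3</td><td>4</td></tr></table>")
horizontal_b = ("<table><tr><th>A</th><th>B</th></tr>"
                "<tr><td>1</td><td>2</td></tr>"
                "<tr><td>3</td><td>4</td></tr>"
                "<tr><td>5</td><td>6</td></tr></table>")

print(dtw_distance(row_signatures(horizontal_a), row_signatures(horizontal_b)))
```

Because DTW lets one row of the shorter table align with several rows of the longer one, tables that share a layout but differ in length can still score as structurally close under this encoding. A pairwise distance matrix built this way could then be passed to any standard clustering routine.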

Year	2016
DOI	10.1007/978-3-319-47650-6_24
Venue	Lecture Notes in Artificial Intelligence
Keywords	Knowledge extraction, Web tables, DOM tree similarity, Table clustering
Field	Data mining, Information retrieval, Computer science, Knowledge extraction, Document Object Model, Web tables, Cluster analysis
DocType	Conference
Volume	9983
ISSN	0302-9743
Citations	1
PageRank	0.39
References	18
Authors	5

Authors

Name	Order	Citations	PageRank
Xiaolong Wu	1	128	18.86
Cungen Cao	2	309	58.63
Ya Wang	3	9	5.25
Jianhui Fu	4	2	0.75
Shi Wang	5	28	12.46
