Ingredients for accurate, fast, and robust XML similarity joins - Citegraph

Paper Info

Title
Ingredients for accurate, fast, and robust XML similarity joins

Abstract
We consider the problem of answering similarity join queries on large, non-schematic, heterogeneous XML datasets. Realizing similarity joins on such datasets is challenging, because the semi-structured nature of XML substantially increases the complexity of the underlying similarity function in terms of both effectiveness and efficiency. Moreover, even the selection of pieces of information for similarity assessment is complicated because these can appear at different parts among documents in a dataset. In this paper, we present an approach that jointly calculates textual and structural similarity of XML trees while implicitly embedding similarity selection into join processing. We validate the accuracy, performance, and scalability of our techniques with a set of experiments in the context of an XML DBMS.

Year	DOI	Venue
2011	10.1007/978-3-642-23091-2_3	DEXA (2)
Keywords	Field	DocType
different part,realizing similarity,xml dbms,structural similarity,xml tree,semi-structured nature,similarity assessment,embedding similarity selection,underlying similarity function,robust xml similarity,heterogeneous xml datasets,xml	Semantic similarity,Data mining,Joins,Efficient XML Interchange,Information retrieval,XML,Computer science,XML validation,XML database,XML schema,Database,Scalability	Conference
Citations	PageRank	References
0	0.34	12
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Leonardo Andrade Ribeiro	1	45	8.62
Theo Härder	2	1132	307.12

Cited by (0 rows)

References (12 rows)
