Using structural similarity for clustering XML documents - Citegraph

Paper Info

Title
Using structural similarity for clustering XML documents

Abstract
In this paper, we describe a method for clustering XML documents whose goal is to group documents that share similar structures. Our approach has two steps. We first automatically extract the structure of each XML document to be classified. This extracted structure is then used as a representation model to classify the corresponding XML document. The idea behind the clustering is that XML documents sharing similar structures are more likely to correspond to the structural part of the same query. Finally, we tested our algorithms on both real data (the ACM SIGMOD Record corpus) and synthetic data. The results demonstrate the interest of our approach.
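The two-step idea in the abstract (extract a structure, then group documents by structural similarity) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the representation or the similarity measure, so the sketch assumes a set of root-to-node tag paths as the structure, Jaccard similarity as the measure, and a greedy single-pass grouping. The function names `extract_paths`, `jaccard`, and `cluster`, the `threshold` value, and the sample documents are all hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_paths(xml_text):
    """Step 1 (assumed representation): the set of root-to-node tag paths.

    Text content and attributes are ignored; only element nesting is kept.
    """
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return paths


def jaccard(a, b):
    """Assumed similarity measure: Jaccard index of two path sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(docs, threshold=0.5):
    """Step 2 (assumed strategy): greedy single-pass clustering.

    Each document joins the first cluster whose representative path set is
    at least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # list of (representative_paths, member_indices)
    for index, text in enumerate(docs):
        paths = extract_paths(text)
        for representative, members in clusters:
            if jaccard(representative, paths) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((paths, [index]))
    return [members for _, members in clusters]


# Hypothetical sample documents, not taken from the SIGMOD Record corpus.
docs = [
    "<article><title>A</title><author>X</author></article>",
    "<article><title>B</title><author>Y</author><author>Z</author></article>",
    "<issue><volume>1</volume><number>2</number></issue>",
]
result = cluster(docs)
print(result)  # [[0, 1], [2]]
```

The two `article` documents produce identical path sets and fall into one cluster, while the `issue` document shares no path with them and forms its own. A path-set representation discards element order and repetition, which is one possible simplification among many.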

Year	2012
DOI	10.1007/s10115-011-0421-5
Venue	Knowl. Inf. Syst.
Keywords	group document,structural similarity,clustering xml document,structural part,representation model,corresponding xml document,synthetic data,similar structure,xml document,clustering · context · node · similarity · structural classification · threshold · tree,experimentation purpose,acm sigmod record corpus
Field	Fuzzy clustering,Data mining,XML Schema (W3C),Information retrieval,XML,Computer science,XML validation,Structural classification,Synthetic data,Simple API for XML,Cluster analysis
DocType	Journal
Volume	32
Issue	1
ISSN	0219-3116
Citations	7
PageRank	0.43
References	41
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Ali Aïtelhadj	1	8	0.77
Mohand Boughanem	2	923	109.00
Mohamed Mezghiche	3	25	11.68
Fatiha Souam	4	8	1.11
