Abstract | | |
---|---|---|
In this paper, we propose a set of similarity metrics for manipulating collections of values occurring in XML documents. Following the data model presented in the TAX algebra, we treat an XML element as a labeled, ordered, rooted tree. Considering that XML nodes can be either atomic, i.e., they contain single values such as short character strings, dates, etc., or complex, i.e., nested structures that contain other nodes, we propose two types of similarity metrics: MAVs for atomic nodes and MCVs for complex nodes. In the first case, we suggest the use of several application-domain-dependent metrics. In the second case, we define metrics for complex values that are structure dependent and can be applied separately to tuples and to collections of values. We also present experiments showing the effectiveness of our method. |
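
The split between atomic and complex nodes described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's MAV or MCV definitions, which the abstract does not spell out. The function names `atomic_similarity` and `node_similarity`, the use of `difflib.SequenceMatcher` as the atomic metric, and the greedy best-match averaging over children are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def atomic_similarity(a: str, b: str) -> float:
    """Assumed stand-in for an atomic-value metric: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def node_similarity(x: ET.Element, y: ET.Element) -> float:
    """Compare two XML elements treated as labeled trees (illustrative only)."""
    if x.tag != y.tag:
        return 0.0  # different labels: treated as not comparable
    xs, ys = list(x), list(y)
    if not xs and not ys:
        # Both nodes are atomic: compare their text values.
        return atomic_similarity(x.text or "", y.text or "")
    if not xs or not ys:
        return 0.0  # one atomic, one complex: no structural match
    # Both nodes are complex: for each child of x, take its best match in y,
    # then normalize by the larger child count so extra children lower the score.
    total = sum(max(node_similarity(cx, cy) for cy in ys) for cx in xs)
    return total / max(len(xs), len(ys))


a = ET.fromstring("<author><name>Carlos Heuser</name><city>Porto Alegre</city></author>")
b = ET.fromstring("<author><name>Carlos A. Heuser</name><city>Porto Alegre</city></author>")
score = node_similarity(a, b)
```

In this sketch the atomic metric is pluggable, which mirrors the abstract's point that atomic metrics depend on the application domain, while the complex-node score depends only on structure and on the scores of the children.
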
Year | DOI | Venue |
---|---|---|
2004 | 10.1145/1031453.1031465 | WIDM |
Keywords | Field | DocType |
similarity metrics,xml element,tax algebra,complex value,application domain dependent metrics,atomic node,data model,complex node,xml node,xml document,xml | Data mining,XML,Information retrieval,Computer science,XML validation,Application domain,Data model | Conference |
ISBN | Citations | PageRank |
1-58113-978-0 | 19 | 1.01 |
References | Authors | |
15 | 5 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Carina F. Dorneles | 1 | 61 | 10.35 |
Carlos A. Heuser | 2 | 384 | 61.97 |
Andrei E. N. Lima | 3 | 19 | 1.01 |
Altigran Soares da Silva | 4 | 718 | 65.15 |
Edleno Silva de Moura | 5 | 988 | 75.44 |