Abstract | ||
---|---|---|
The inference of minimum spanning arborescences within a set of objects is a general problem that translates into numerous application-specific unsupervised learning tasks. We introduce a unified and generic structure called an edit arborescence, which relies on edit paths between data in a collection, as well as the MINIMUM EDIT ARBORESCENCE PROBLEM, which asks for an edit arborescence that minimizes the sum of the costs of its inner edit paths. With suitable cost functions, this generic framework can model a variety of problems. In particular, we show that, by introducing encoding-size-preserving edit costs, it can be used as an efficient method for compressing collections of labeled graphs. Experiments on various graph datasets, with comparisons to standard compression tools, show the potential of our method. |
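
To make the idea in the abstract concrete, here is a minimal sketch and not the authors' method. It swaps labeled graphs for plain strings and uses unit-cost Levenshtein distance as a stand-in for the edit-path cost. The function names `levenshtein` and `edit_arborescence` are hypothetical, as is the choice of the empty string as a virtual root. Because this stand-in cost is symmetric, a minimum arborescence rooted at the virtual root has the same total cost as a minimum spanning tree, so Prim's algorithm suffices. Asymmetric costs, which an encoding-size-based cost may well be, would instead need a directed algorithm such as Chu-Liu/Edmonds.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance between two strings (symmetric stand-in cost)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def edit_arborescence(items: List[str]) -> Tuple[int, List[Tuple[int, int, int]]]:
    """Prim's algorithm grown from a virtual root (assumption: the empty string).

    Node 0 is the virtual root; node k (k >= 1) is items[k - 1].
    Returns the total cost and the arcs as (parent, child, cost) triples.
    Valid as a minimum arborescence only because the cost is symmetric.
    """
    nodes = [""] + list(items)
    n = len(nodes)
    in_tree = [False] * n
    in_tree[0] = True
    # Cheapest known way to reach each node, initially straight from the root.
    best_cost = [levenshtein(nodes[0], nodes[v]) for v in range(n)]
    best_parent = [0] * n
    arcs: List[Tuple[int, int, int]] = []
    total = 0
    for _ in range(n - 1):
        # Pick the cheapest node not yet attached to the tree.
        v = min((u for u in range(n) if not in_tree[u]), key=lambda u: best_cost[u])
        in_tree[v] = True
        arcs.append((best_parent[v], v, best_cost[v]))
        total += best_cost[v]
        # Attaching v may offer cheaper edit paths to the remaining nodes.
        for u in range(n):
            if not in_tree[u]:
                c = levenshtein(nodes[v], nodes[u])
                if c < best_cost[u]:
                    best_cost[u] = c
                    best_parent[u] = v
    return total, arcs


words = ["kitten", "sitten", "sittin", "sitting"]
total, arcs = edit_arborescence(words)
baseline = sum(len(w) for w in words)  # cost of building every word from scratch
```

In this toy setting, `total` plays the role of the compressed size (one object built from scratch, the rest stored as edit paths from their parents), while `baseline` is the cost of storing each object independently.
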
Year | DOI | Venue |
---|---|---|
2021 | 10.1007/978-3-030-89657-7_25 | SIMILARITY SEARCH AND APPLICATIONS, SISAP 2021 |
Keywords | DocType | Volume |
Edit arborescence, Edit distance, Lossless compression | Conference | 13058 |
ISSN | Citations | PageRank |
0302-9743 | 0 | 0.34 |
References | Authors | |
0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lucas Gnecco | 1 | 0 | 0.34 |
Nicolas Boria | 2 | 0 | 0.34 |
Sébastien Bougleux | 3 | 395 | 27.05 |
Florian Yger | 4 | 16 | 4.42 |
David Blumenthal | 5 | 24 | 6.26 |