Title
Continuation methods for approximate large scale object sequencing
Abstract
We propose a set of highly scalable algorithms for the combinatorial data analysis problem of seriating similarity matrices. Seriation consists of finding a permutation of data instances, such that similar instances are nearby in the ordering. Applications of the seriation problem can be found in various disciplines such as in bioinformatics for genome sequencing, data visualization and exploratory data analysis. Our algorithms attempt to minimize certain p-SUM objectives, which also arise in the problem of envelope reduction of sparse matrices. In particular, we present a set of graduated non-convexity algorithms for vector-based relaxations of the general p-SUM problem for \(p \in \left\{ 2, 1, \tfrac{1}{2}\right\} \) that can scale to very large problem sizes. Different choices of p emphasize global versus local similarity pattern structure. We conduct a number of experiments to compare our algorithms to various state-of-the-art combinatorial optimization methods on real and synthetic datasets. The experimental results demonstrate that compared to other approaches, the proposed algorithms are very competitive and scale well with large problem sizes.
Year
DOI
Venue
2019
10.1007/s10994-018-5764-7
Machine Learning
Keywords
Field
DocType
Combinatorial data analysis,Combinatorial optimization,Graduated non-convexity,Sequencing,Seriation
Combinatorial data analysis,Data visualization,Pattern recognition,Matrix (mathematics),Permutation,Algorithm,Combinatorial optimization,Artificial intelligence,Exploratory data analysis,Sparse matrix,Mathematics,Seriation (archaeology)
Journal
Volume
Issue
ISSN
108.0
4
1573-0565
Citations 
PageRank 
References 
0
0.34
27
Authors
4
Name
Order
Citations
PageRank
Xenophon Evangelopoulos100.34
Austin J. Brockmeier2205.24
Tingting Mu3194.98
John Y. Goulermas400.68