Data analysis of (non-)metric proximities at linear costs - Citegraph

Paper Info

Title
Data analysis of (non-)metric proximities at linear costs

Abstract
Domain-specific (dis-)similarity or proximity measures, employed e.g. in alignment algorithms in bioinformatics, are often used to compare complex data objects and to cover domain-specific data properties. Lacking an underlying vector space, data are given as pairwise (dis-)similarities. The few available methods for such data do not scale well to very large data sets. Kernel methods easily deal with metric similarity matrices, also at large scale, but costly transformations are necessary when starting with non-metric (dis-)similarities. We propose an integrative combination of Nyström approximation, potential double centering and eigenvalue correction to obtain valid kernel matrices at linear costs. Accordingly, effective kernel approaches become accessible for these data. Evaluation on several larger (dis-)similarity data sets shows that the proposed method achieves much better runtime performance than the standard strategy while keeping competitive model accuracy. Our main contribution is an efficient linear technique to convert (potentially non-metric) large-scale dissimilarity matrices into approximated positive semi-definite kernel matrices.
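
The combination named in the abstract (Nyström approximation, optional double centering, eigenvalue correction) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `nystroem_psd_factor`, its parameters, the choice of clipping as the eigenvalue correction, and the QR-based route to the eigendecomposition are all assumptions made for illustration. The sketch only ever touches the m landmark columns of the n x n proximity matrix, so its cost grows linearly in n for fixed m.

```python
import numpy as np


def nystroem_psd_factor(C, idx, squared_dissimilarity=False, tol=1e-10):
    """Hypothetical helper: return L (n x r) so that L @ L.T is a PSD approximation.

    C   : (n, m) array holding the m landmark columns of a symmetric n x n proximity matrix
    idx : the m landmark row indices, so that C[idx] is the m x m landmark block
    squared_dissimilarity : if True, C holds squared dissimilarities and is double centered
    """
    C = np.asarray(C, dtype=float)
    # Nystroem approximation P ~ C @ W_pinv @ C.T; the n x n matrix is never formed.
    W_pinv = np.linalg.pinv(C[idx, :], rcond=tol)
    W_pinv = (W_pinv + W_pinv.T) / 2.0
    if squared_dissimilarity:
        # Double centering S = -1/2 * J P J with J = I - (1/n) 1 1^T.
        # J @ C subtracts each column's mean, so this step costs O(n m).
        B = C - C.mean(axis=0, keepdims=True)
        M = -0.5 * W_pinv
    else:
        B, M = C, W_pinv
    # Eigendecomposition of B @ M @ B.T via an m x m problem, O(n m^2) overall.
    Q, R = np.linalg.qr(B)  # reduced QR: Q has orthonormal columns
    T = R @ M @ R.T
    vals, vecs = np.linalg.eigh((T + T.T) / 2.0)
    # Eigenvalue correction (assumed here: clipping). Drop non-positive eigenvalues.
    scale = np.abs(vals).max() if vals.size else 0.0
    keep = vals > tol * scale
    return (Q @ vecs[:, keep]) * np.sqrt(vals[keep])


# Toy usage on synthetic, indefinite similarities (illustrative data only).
rng = np.random.default_rng(0)
S = rng.standard_normal((200, 200))
S = (S + S.T) / 2.0
idx = np.arange(20)
L = nystroem_psd_factor(S[:, idx], idx)
K = L @ L.T  # formed only to inspect the result; downstream methods can work with L
```

Returning the factor `L` rather than the full matrix keeps memory linear in n as well; a kernel method can treat the rows of `L` as explicit feature vectors.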

Year	DOI	Venue
2013	10.1007/978-3-642-39140-8_4	SIMBAD
Keywords	Field	DocType
valid kernel matrix,linear cost,kernel method,large scale dissimilarity matrix,effective kernel approach,large scale,data analysis,large data set,domain specific data property,similarity data,approximated positive semi-definite kernel,metric proximity,complex data object	Kernel (linear algebra),Pairwise comparison,Mathematical optimization,Data set,Matrix (mathematics),Support vector machine,Complex data type,Algorithm,Kernel method,Eigenvalues and eigenvectors,Mathematics	Conference
Citations	PageRank	References
14	0.54	21
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Frank-Michael Schleif	1	427	46.59
Andrej Gisbrecht	2	195	15.60
