Title
Approximation of graph kernel similarities for chemical graphs by kernel principal component analysis
Abstract
Graph kernels have been applied successfully to chemical graphs in small- to medium-sized machine learning problems. However, graph kernels often require a graph transformation before the computation can be applied. Furthermore, the kernel calculation can have polynomial complexity of degree three or higher. Therefore, they cannot be applied to large instance-based machine learning problems. Using kernel principal component analysis, we mapped the compounds onto the principal components, obtaining q-dimensional real-valued vectors. The goal of this study is to investigate the correlation between the graph kernel similarities and the similarities between these vectors. In the experiments, we compared the similarities on various data sets covering a wide range of typical chemical data mining problems. The similarity matrix of the vectorial projections was computed with the Jaccard and cosine similarity coefficients and was correlated with the similarity matrix of the original graph kernel. The main result is that there is a strong correlation, in terms of both rank correlation and linear correlation, between the similarities of the vectors and those of the original graph kernel. The method seems to be robust and independent of the choice of the reference subset, with observed standard deviations below 5%. Important applications of the approach are instance-based data mining and machine learning tasks where computing the original graph kernel would be prohibitive.
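The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the Jaccard coefficient on real-valued vectors is taken in its Tanimoto form, and that rank correlation is measured with Spearman's coefficient; the abstract does not confirm either choice. All function names and the toy data are hypothetical, and the kernel PCA projection itself is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two real-valued vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def tanimoto(u, v):
    """Tanimoto form of the Jaccard coefficient (assumed here):
    <u,v> / (|u|^2 + |v|^2 - <u,v>)."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sum(a * a for a in u) + sum(b * b for b in v) - dot)


def similarity_matrix(vectors, coefficient):
    """Pairwise similarity matrix of the projected vectors."""
    return [[coefficient(u, v) for v in vectors] for u in vectors]


def upper_triangle(matrix):
    """Flatten the strict upper triangle so each pair is counted once."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Linear (Pearson) correlation of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Rank correlation (Spearman, assumed): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical toy data: a 4x4 "graph kernel" similarity matrix and
    # made-up 2-dimensional projections of the same four compounds.
    kernel = [
        [1.0, 0.8, 0.3, 0.1],
        [0.8, 1.0, 0.4, 0.2],
        [0.3, 0.4, 1.0, 0.7],
        [0.1, 0.2, 0.7, 1.0],
    ]
    projected = [[1.0, 0.1], [0.9, 0.3], [0.3, 0.9], [0.1, 1.0]]
    reference = upper_triangle(kernel)
    for name, coefficient in (("cosine", cosine), ("tanimoto", tanimoto)):
        approx = upper_triangle(similarity_matrix(projected, coefficient))
        print(name, pearson(reference, approx), spearman(reference, approx))
```

Only the strict upper triangle of each matrix is compared, so the trivial diagonal self-similarities do not inflate the correlation.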
Year
2011
DOI
10.1007/978-3-642-20389-3_12
Venue
EvoBIO
Keywords
rank correlation, graph kernel, original graph kernel, chemical graph, kernel principal component analysis, graph kernel similarity, similarity matrix, graph transformation, linear correlation, kernel calculation, standard deviation, data mining, machine learning, principal component, set cover
Field
Graph kernel, Radial basis function kernel, Pattern recognition, Kernel embedding of distributions, Tree kernel, Kernel principal component analysis, Polynomial kernel, Artificial intelligence, String kernel, Kernel method, Machine learning, Mathematics
DocType
Conference
Volume
6623
ISSN
0302-9743
Citations
0
PageRank
0.34
References
12
Authors
5
Name              Order  Citations  PageRank
Georg Hinselmann  1      96         8.12
Andreas Jahn      2      0          0.34
Nikolas Fechner   3      103        8.38
Lars Rosenbaum    4      62         5.49
Andreas Zell      5      1419       137.58