Title | ||
---|---|---|
Approximation of graph kernel similarities for chemical graphs by kernel principal component analysis |
Abstract | ||
---|---|---|
Graph kernels have been successfully applied to chemical graphs in small to medium-sized machine learning problems. However, graph kernels often require a graph transformation before the computation can be applied. Furthermore, the kernel calculation can have polynomial complexity of degree three or higher. Therefore, they cannot be applied to large instance-based machine learning problems. Using kernel principal component analysis, we mapped the compounds onto the principal components, obtaining q-dimensional real-valued vectors. The goal of this study is to investigate the correlation between the graph kernel similarities and the similarities between the vectors. In the experiments, we compared the similarities on various data sets covering a wide range of typical chemical data mining problems. The similarity matrix of the vectorial projections was computed with the Jaccard and cosine similarity coefficients and was correlated with the similarity matrix of the original graph kernel. The main result is that there is a strong rank and linear correlation between the similarities of the vectors and those of the original graph kernel. The method seems to be robust and independent of the choice of the reference subset, with observed standard deviations below 5%. An important application of the approach is instance-based data mining and machine learning tasks where computing the original graph kernel would be prohibitive. |
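
The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: an RBF kernel on random vectors stands in for the graph kernel on chemical graphs, and the sizes (60 items, 20 reference items, q = 10), the helper names, and the real-valued (Tanimoto) form of the Jaccard coefficient are all hypothetical choices made for the example.

```python
import numpy as np

def fit_kpca(K_ref, q):
    """Kernel PCA on a reference kernel matrix; returns scaled coefficients (m x q)."""
    m = K_ref.shape[0]
    one = np.full((m, m), 1.0 / m)
    # Center the kernel matrix in feature space.
    Kc = K_ref - one @ K_ref - K_ref @ one + one @ K_ref @ one
    eigvals, eigvecs = np.linalg.eigh(Kc)
    top = np.argsort(eigvals)[::-1][:q]
    # Scale so that each principal axis has unit norm in feature space.
    return eigvecs[:, top] / np.sqrt(eigvals[top])

def project(K_new_ref, K_ref, alphas):
    """Map items to q-dimensional vectors from their kernel values against the reference set."""
    m = K_ref.shape[0]
    one_m = np.full((m, m), 1.0 / m)
    one_n = np.full((K_new_ref.shape[0], m), 1.0 / m)
    Kc = K_new_ref - one_n @ K_ref - K_new_ref @ one_m + one_n @ K_ref @ one_m
    return Kc @ alphas

def cosine_matrix(Z):
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    return Zn @ Zn.T

def tanimoto_matrix(Z):
    # Real-valued generalization of the Jaccard coefficient (an assumption here).
    G = Z @ Z.T
    sq = np.diag(G)
    return G / (sq[:, None] + sq[None, :] - G)

def ranks(v):
    # Simple ranking without tie handling; adequate for continuous values.
    return np.argsort(np.argsort(v))

# Stand-in data: an RBF kernel replaces the (expensive) graph kernel.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.1 * sq_dist)            # "original kernel" similarity matrix

ref = rng.choice(60, size=20, replace=False)   # reference subset
K_ref = K[np.ix_(ref, ref)]
alphas = fit_kpca(K_ref, q=10)
Z = project(K[:, ref], K_ref, alphas)          # q-dimensional vectors for all items

S_cos = cosine_matrix(Z)
S_tan = tanimoto_matrix(Z)

# Correlate vector similarities with the original kernel (upper triangle only).
iu = np.triu_indices(K.shape[0], k=1)
pearson = np.corrcoef(K[iu], S_cos[iu])[0, 1]
spearman = np.corrcoef(ranks(K[iu]), ranks(S_cos[iu]))[0, 1]
print(f"linear correlation: {pearson:.3f}, rank correlation: {spearman:.3f}")
```

In this sketch only the kernel values against the reference subset are needed to embed an item, which is what would make the approach attractive when evaluating the full kernel matrix is too costly. The correlations printed for the stand-in data say nothing about the values reported in the paper.
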
Year | DOI | Venue |
---|---|---|
2011 | 10.1007/978-3-642-20389-3_12 | EvoBIO |
Keywords | Field | DocType |
rank correlation,graph kernel,original graph kernel,chemical graph,kernel principal component analysis,graph kernel similarity,similarity matrix,graph transformation,linear correlation,kernel calculation,standard deviation,data mining,machine learning,principal component,set cover | Graph kernel,Radial basis function kernel,Pattern recognition,Kernel embedding of distributions,Tree kernel,Kernel principal component analysis,Polynomial kernel,Artificial intelligence,String kernel,Kernel method,Machine learning,Mathematics | Conference |
Volume | ISSN | Citations |
6623 | 0302-9743 | 0 |
PageRank | References | Authors |
0.34 | 12 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Georg Hinselmann | 1 | 96 | 8.12 |
Andreas Jahn | 2 | 0 | 0.34 |
Nikolas Fechner | 3 | 103 | 8.38 |
Lars Rosenbaum | 4 | 62 | 5.49 |
Andreas Zell | 5 | 1419 | 137.58 |