Abstract | ||
---|---|---|
We regard a data pattern as a physical particle experiencing a force imposed by an overall "potential energy" of the data set, obtained via a non-parametric estimate of Renyi's entropy. The "potential energy" is called the information potential, and the forces are called information forces, due to their information-theoretic origin. We create directed trees by selecting the predecessor of a node (pattern) according to the direction of the information force acting on the pattern. Each directed tree corresponds to a cluster, hence enabling us to partition the data set. The clustering metric underlying our method is thus based on entropy, a quantity that conveys information about the shape of a probability density and not only its variance, on which many traditional algorithms based on mere second-order statistics rely. We demonstrate the performance of our clustering technique when applied to both artificially created data and real data, and also discuss some limitations of the proposed method. |
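
The pipeline described in the abstract can be sketched in a few lines of NumPy. The sketch is illustrative only and rests on several assumptions that the abstract does not state: a Gaussian Parzen kernel of width `sigma`, Renyi's quadratic entropy (so that the information potential is the mean of all pairwise kernels), a `k`-nearest-neighbor candidate set for predecessors, and a rule that only denser neighbors may become predecessors, which is added here purely to rule out cycles. The function names `information_forces` and `directed_tree_labels` are hypothetical and are not taken from the paper.

```python
import numpy as np


def information_forces(X, sigma):
    """Information potential, forces, and per-pattern potential for X of shape (N, d)."""
    N, d = X.shape
    diff = X[None, :, :] - X[:, None, :]            # diff[i, j] = x_j - x_i
    var = 2.0 * sigma ** 2                          # two Parzen windows convolved
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * var))
    K /= (2.0 * np.pi * var) ** (d / 2.0)
    potential = K.sum() / N ** 2                    # V; quadratic entropy is -log(V)
    # Gradient of V with respect to x_i: a kernel-weighted pull toward the other patterns.
    forces = 2.0 * np.einsum("ij,ijk->ik", K, diff) / (N ** 2 * var)
    density = K.mean(axis=1)                        # contribution of pattern i to V
    return potential, forces, density


def directed_tree_labels(X, sigma, k=5):
    """Label each pattern with the index of the root of its directed tree."""
    N = X.shape[0]
    _, forces, density = information_forces(X, sigma)
    diff = X[None, :, :] - X[:, None, :]
    dist = np.linalg.norm(diff, axis=-1)
    parent = np.arange(N)                           # a root is its own predecessor
    for i in range(N):
        neighbors = np.argsort(dist[i])[1:k + 1]    # nearest patterns, skipping i itself
        # Assumption of this sketch: only denser neighbors qualify, so no cycles can form.
        cand = neighbors[density[neighbors] > density[i]]
        if cand.size == 0:
            continue
        # Cosine between the force on i and the direction to each candidate.
        cos = diff[i, cand] @ forces[i]
        cos = cos / (dist[i, cand] * np.linalg.norm(forces[i]) + 1e-12)
        if cos.max() > 0.0:
            parent[i] = cand[np.argmax(cos)]
    labels = parent.copy()
    while not np.array_equal(labels, parent[labels]):
        labels = parent[labels]                     # climb one level toward the root
    return labels
```

Patterns that share a root index belong to the same tree and hence to the same cluster. The number of trees depends on `sigma` and `k`, which this sketch leaves to the user.
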
Year | DOI | Venue |
---|---|---|
2003 | 10.1007/978-3-540-45063-4_5 | Lecture Notes in Computer Science |
Keywords | Field | DocType |
probability density, potential energy | Information theory, Combinatorics, Mathematical optimization, Data patterns, Computer science, Algorithm, Potential energy, Cluster analysis, Partition (number theory), Data partitioning, Probability density function | Conference |
Volume | ISSN | Citations |
2683 | 0302-9743 | 4 |
PageRank | References | Authors |
0.47 | 13 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Robert Jenssen | 1 | 370 | 43.06 |
Deniz Erdogmus | 2 | 1299 | 169.92 |
K E Hild | 3 | 196 | 21.18 |
José Carlos Príncipe | 4 | 841 | 102.43 |
Torbjørn Eltoft | 5 | 583 | 48.56 |