Abstract | ||
---|---|---|
Neighbor embedding (NE) methods have found their use in data visualization but are limited in big data analysis tasks due to their O(n^2) complexity for n data samples. We demonstrate that the obvious approach of subsampling produces inferior results and propose a generic approximated optimization technique that reduces the NE optimization cost to O(n log n). The technique is based on realizing that in visualization the embedding space is necessarily very low-dimensional (2D or 3D), and hence efficient approximations developed for n-body force calculations can be applied. In gradient-based NE algorithms the gradient for an individual point decomposes into “forces” exerted by the other points. The contributions of close-by points need to be computed individually but far-away points can be approximated by their “center of mass”, rapidly computable by applying a recursive decomposition of the visualization space into quadrants. The new algorithm brings a significant speed-up for medium-size data, and brings “big data” within reach of visualization. |
Year | Venue | Field |
---|---|---|
2013 | ICML | Data visualization,Embedding,Visualization,Computer science,Artificial intelligence,Time complexity,Big data,Center of mass,Machine learning,Scalability,Recursive decomposition |
DocType | Citations | PageRank |
Conference | 27 | 0.85 |
References | Authors | |
11 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zhirong Yang | 1 | 289 | 17.27 |
Jaakko Peltonen | 2 | 542 | 41.64 |
Samuel Kaski | 3 | 2755 | 245.52 |