Abstract | ||
---|---|---|
This paper studies the feasibility of, and scalable computing processes for, visualizing big high-dimensional data in a 3-dimensional space by using dimension reduction techniques. More specifically, we propose an unsupervised approach to computing a measure, called visualizability, of how well a high-dimensional data set can be visualized in a 3-dimensional space. Visualizability is computed by comparing the clustering structures of the data before and after dimension reduction. Computing it requires finding an optimal clustering structure for the given data set. We therefore implement a scalable approach, based on the K-Means algorithm, for finding an optimal clustering structure in big data. The volume of a big data set can then be reduced for dimension reduction and visualization by sampling the data according to the discovered clustering structure. |
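
The abstract does not give the formula for visualizability or the sampling rule, so the following is only a rough sketch of the two ideas it names: comparing cluster structure before and after dimension reduction, and sampling according to discovered clusters. The Rand index is used here as a stand-in agreement score and is not claimed to be the paper's measure. The function names `rand_index` and `sample_by_cluster`, the `fraction` and `seed` parameters, and the assumption that cluster labels have already been computed (for example by K-Means) are all hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def rand_index(labels_before, labels_after):
    """Fraction of point pairs on which two clusterings agree.

    Stand-in for a visualizability-style score (an assumption, not the
    paper's definition): 1.0 means the grouping of points found in the
    original space is fully preserved in the reduced space.
    """
    if len(labels_before) != len(labels_after):
        raise ValueError("label lists must have the same length")
    n = len(labels_before)
    if n < 2:
        return 1.0
    agree = 0
    total = 0
    # Quadratic in n, so this is meant for a sample, not the full data set.
    for i, j in combinations(range(n), 2):
        same_before = labels_before[i] == labels_before[j]
        same_after = labels_after[i] == labels_after[j]
        agree += same_before == same_after
        total += 1
    return agree / total


def sample_by_cluster(labels, fraction, seed=0):
    """Return sorted indices of a sample drawn separately from each cluster.

    Every cluster contributes roughly `fraction` of its points and at
    least one point, so small clusters are not lost (a hypothetical
    sampling rule chosen for illustration).
    """
    rng = random.Random(seed)
    members = defaultdict(list)
    for index, label in enumerate(labels):
        members[label].append(index)
    chosen = []
    for label in sorted(members):
        indices = members[label]
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)
```

In this sketch, cluster labels would first be computed on the full data, `sample_by_cluster` would pick the points to reduce and plot, and `rand_index` would compare the labels of those points before and after reduction. Renaming clusters does not change the score, since only co-membership of point pairs is compared.
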
Year | DOI | Venue |
---|---|---|
2016 | 10.1145/3006299.3006340 | Proceedings of the 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies |
Keywords | Field | DocType |
big data visualization, visualize high-dimensional data, big high dimensional data, visualizability, optimal clustering structure, dimension reduction | Data mining, Data set, Clustering high-dimensional data, CURE data clustering algorithm, Data visualization, Data stream clustering, Correlation clustering, Computer science, Visualization, Cluster analysis | Conference |
ISBN | Citations | PageRank |
978-1-5090-4468-9 | 0 | 0.34 |
References | Authors | |
11 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ying Xie | 1 | 47 | 14.48 |
Pooja Chenna | 2 | 0 | 0.34 |
Jing (Selena) He | 3 | 129 | 14.61 |
Linh Le | 4 | 0 | 2.03 |
Jacey Planteen | 5 | 0 | 0.34 |