Title | ||
---|---|---|
SSCC: A Novel Computational Framework for Rapid and Accurate Clustering Large-scale Single Cell RNA-seq Data. |
Abstract | ||
---|---|---|
Clustering is a prevalent analytical means to analyze single cell RNA sequencing (scRNA-seq) data, but the rapidly expanding data volume can make this process computationally challenging. New methods for both accurate and efficient clustering are urgently needed. Here we propose Spearman subsampling-clustering-classification (SSCC), a new clustering framework based on random projection and feature construction, for large-scale scRNA-seq data. SSCC greatly improves clustering accuracy, robustness, and computational efficiency for various state-of-the-art algorithms benchmarked on multiple real datasets. On a dataset with 68,578 human blood cells, SSCC achieved a 20% improvement in clustering accuracy and a 50-fold acceleration while using only 66% of the memory, compared to the widely used software package SC3. Compared to k-means, the accuracy improvement of SSCC can reach 3-fold. An R implementation of SSCC is available at https://github.com/Japrin/sscClust. |
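
The three stages named in the abstract (subsample, cluster, classify) can be illustrated with a minimal Python sketch. This is not the sscClust implementation: every function name here (`ranks`, `spearman`, `cluster_subsample`, `sscc_sketch`) is hypothetical, a tiny k-medoids routine stands in for whichever clustering algorithm is plugged in, a nearest-medoid rule stands in for the classifier, and the random projection and feature construction steps mentioned in the abstract are omitted.

```python
import random

def ranks(values):
    """Average ranks (tied values share their mean rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            out[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def cluster_subsample(cells, k, n_iter=10):
    """Tiny k-medoids on Spearman similarity (assumed stand-in clusterer)."""
    n = len(cells)
    sim = [[spearman(cells[i], cells[j]) for j in range(n)] for i in range(n)]
    medoids = [0]
    while len(medoids) < k:
        # Farthest-first seeding: take the cell least similar to its closest medoid.
        rest = [i for i in range(n) if i not in medoids]
        medoids.append(min(rest, key=lambda i: max(sim[i][m] for m in medoids)))

    def assign(meds):
        return [max(range(k), key=lambda c: sim[i][meds[c]]) for i in range(n)]

    for _ in range(n_iter):
        labels = assign(medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            # Keep the old medoid if a cluster happens to be empty.
            updated.append(max(members, key=lambda i: sum(sim[i][j] for j in members))
                           if members else medoids[c])
        if updated == medoids:
            break
        medoids = updated
    return assign(medoids), [cells[m] for m in medoids]

def sscc_sketch(cells, k, n_sub, seed=0):
    """Subsample, cluster the subsample, then classify the remaining cells."""
    rng = random.Random(seed)
    idx = rng.sample(range(len(cells)), n_sub)                  # 1) subsample
    sub_labels, medoids = cluster_subsample([cells[i] for i in idx], k)  # 2) cluster
    labels = [None] * len(cells)
    for i, lab in zip(idx, sub_labels):
        labels[i] = lab
    for i, cell in enumerate(cells):                            # 3) classify the rest
        if labels[i] is None:
            labels[i] = max(range(k), key=lambda c: spearman(cell, medoids[c]))
    return labels
```

In this sketch the pairwise similarity matrix is computed only for the `n_sub` subsampled cells, while each remaining cell needs just `k` comparisons. That is the intuition behind the speed and memory savings the abstract reports, although the sketch makes no claim to reproduce those figures.
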
Year | DOI | Venue |
---|---|---|
2019 | 10.1016/j.gpb.2018.10.003 | Genomics, Proteomics & Bioinformatics |
Keywords | Field | DocType |
Single cell,RNA-seq,Clustering,Subsampling,Classification | Random projection,Data mining,Biology,RNA-Seq,Robustness (computer science),Software,Genetics,Cluster analysis | Journal |
Volume | Issue | ISSN |
17 | 2 | 1672-0229 |
Citations | PageRank | References |
2 | 0.64 | 0 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xian-Wen Ren | 1 | 33 | 3.99 |
Liangtao Zheng | 2 | 2 | 0.64 |
Zemin Zhang | 3 | 50 | 6.46 |