Title
Entropy-based consensus clustering for patient stratification.
Abstract
Motivation: Patient stratification or disease subtyping is crucial for precision medicine and personalized treatment of complex diseases. The increasing availability of high-throughput molecular data provides a great opportunity for patient stratification. Many clustering methods have been employed to tackle this problem in a purely data-driven manner. Yet, existing methods leveraging high-throughput molecular data often suffer from various limitations, e.g. noise, data heterogeneity, high dimensionality or poor interpretability. Results: Here we introduce an Entropy-based Consensus Clustering (ECC) method that overcomes those limitations altogether. Our ECC method employs an entropy-based utility function to fuse many basic partitions into a consensus one that agrees with the basic ones as much as possible. Maximizing the utility function in ECC has a much more meaningful interpretation than in other consensus clustering methods. Moreover, we exactly map the complex utility maximization problem to the classic K-means clustering problem, which can then be efficiently solved with linear time and space complexity. Our ECC method can also naturally integrate multiple molecular data types measured from the same set of subjects, and easily handle missing values without any imputation. We applied ECC to 110 synthetic and 48 real datasets, including 35 cancer gene expression benchmark datasets and 13 cancer types with four molecular data types from The Cancer Genome Atlas. We found that ECC shows superior performance against existing clustering methods. Our results clearly demonstrate the power of ECC in clinically relevant patient stratification.
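The abstract does not spell out the utility function, so the following is only a minimal sketch of the general idea of scoring a candidate consensus partition against several basic partitions with an entropy-based measure. It assumes, purely for illustration, that the utility is the sum of Shannon mutual information between the consensus and each basic partition, and that a missing label is stored as None and skipped rather than imputed. The function names (entropy, mutual_information, utility) and the toy data are hypothetical. This is not the authors' implementation and does not perform the K-means mapping described above.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a non-empty list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(consensus, basic):
    """I(consensus; basic) = H(basic) - H(basic | consensus).

    Samples whose basic label is None (assumed to mean "missing")
    are skipped, so no imputation is needed.
    """
    pairs = [(c, b) for c, b in zip(consensus, basic) if b is not None]
    if not pairs:
        return 0.0
    n = len(pairs)
    h_basic = entropy([b for _, b in pairs])
    # Group the observed basic labels by consensus cluster.
    groups = {}
    for c, b in pairs:
        groups.setdefault(c, []).append(b)
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return h_basic - h_cond


def utility(consensus, basic_partitions):
    """Assumed utility: total agreement of the consensus with all basic partitions."""
    return sum(mutual_information(consensus, bp) for bp in basic_partitions)


# Toy data (hypothetical): three basic partitions of six samples.
basic_partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],     # same grouping, different label names
    [0, 0, 1, 1, 1, None],  # one disagreement, one missing label
]
good = [0, 0, 0, 1, 1, 1]
bad = [0, 1, 0, 1, 0, 1]

print(utility(good, basic_partitions) > utility(bad, basic_partitions))
```

Under these assumptions, a candidate that matches the grouping shared by most basic partitions scores higher than one that cuts across it, regardless of how the cluster labels happen to be named.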
Year
2017
DOI
10.1093/bioinformatics/btx167
Venue
BIOINFORMATICS
Field
Data mining, Computer science, Curse of dimensionality, Utility maximization problem, Data type, Consensus clustering, Imputation (statistics), Missing data, Time complexity, Cluster analysis
DocType
Journal
Volume
33
Issue
17
ISSN
1367-4803
Citations
8
PageRank
0.57
References
7
Authors
6
Name             Order  Citations  PageRank
Hongfu Liu       1      281        28.65
Rui Zhao         2      8          0.57
Hongsheng Fang   3      17         3.07
Feixiong Cheng   4      299        21.70
Yun Fu           5      4267       208.09
Yang-Yu Liu      6      119        9.57