Title
Weighted voting-based consensus clustering for chemical structure databases.
Abstract
Cluster-based compound selection is used in the lead identification process of drug discovery and design. Many clustering methods have been applied to chemical databases, but no single method obtains the best results under all circumstances. Combining the results of several clusterings, known as consensus clustering, addresses this problem, yet it has received little attention for chemical structure clustering. Consensus clustering has recently been used in many areas, including bioinformatics, machine learning and information theory, where it can improve the robustness, stability, consistency and novelty of clustering. For chemical databases, the consensus clustering methods used so far include co-association matrix-based, graph-based, hypergraph-based and voting-based methods. In this paper, a weighted cumulative voting-based aggregation algorithm (W-CVAA) was developed. The MDL Drug Data Report (MDDR) benchmark chemical dataset was used in the experiments, represented by the AlogP and ECFP_4 descriptors. The clusterings were evaluated, using different criteria, by their ability to separate biologically active molecules from inactive ones in each cluster, and the effectiveness of the consensus clustering was compared to that of Ward's method, the current standard clustering method in chemoinformatics. The study indicated that weighted voting-based consensus clustering can overcome the limitations of existing voting-based methods and improve the effectiveness of combining multiple clusterings of chemical structures.
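The abstract does not spell out the steps of W-CVAA, so the following is only a minimal sketch of the general idea behind weighted voting-based consensus clustering, not the authors' algorithm. It assumes every input partition uses the same number of clusters k, aligns labels to the first partition by brute-force search over label permutations (practical only for small k), and takes the per-partition weights as given. The function names, the weights and the toy data are all hypothetical.

```python
from itertools import permutations


def relabel_to_reference(reference, partition, k):
    """Permute the labels of `partition` so they agree with `reference`
    on as many items as possible (brute force over all k! permutations)."""
    best_perm, best_agree = None, -1
    for perm in permutations(range(k)):
        agree = sum(1 for r, p in zip(reference, partition) if perm[p] == r)
        if agree > best_agree:
            best_perm, best_agree = perm, agree
    return [best_perm[p] for p in partition]


def weighted_vote_consensus(partitions, weights, k):
    """Combine partitions by weighted plurality vote per item.

    Assumption: the first partition serves as the reference labelling.
    Ties are broken in favour of the lowest cluster label.
    """
    reference = partitions[0]
    aligned = [reference] + [
        relabel_to_reference(reference, p, k) for p in partitions[1:]
    ]
    n = len(reference)
    votes = [[0.0] * k for _ in range(n)]
    for part, w in zip(aligned, weights):
        for i, label in enumerate(part):
            votes[i][label] += w  # each partition votes with its weight
    return [max(range(k), key=lambda c: votes[i][c]) for i in range(n)]


# Toy example: three clusterings of six molecules into k = 2 clusters.
p1 = [0, 0, 0, 1, 1, 1]
p2 = [1, 1, 1, 0, 0, 0]  # same grouping as p1, labels swapped
p3 = [0, 0, 1, 1, 1, 1]  # disagrees with p1 on the third molecule
consensus = weighted_vote_consensus([p1, p2, p3], [1.0, 1.0, 0.5], k=2)
```

With these weights the two agreeing partitions outvote the third. Raising the weight of `p3` above their combined weight would flip the label of the third molecule, which is the effect a weighting scheme is meant to control.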
Year: 2014
DOI: 10.1007/s10822-014-9750-2
Venue: Journal of computer-aided molecular design
Keywords: Chemical dataset, Compound selection, Consensus clustering, Cumulative voting, Weighting schemes
Field: Data mining, Fuzzy clustering, CURE data clustering algorithm, Clustering high-dimensional data, Correlation clustering, Chemistry, Consensus clustering, Artificial intelligence, Conceptual clustering, Brown clustering, Cluster analysis, Machine learning
DocType: Journal
Volume: 28
Issue: 6
ISSN: 1573-4951
Citations: 1
PageRank: 0.34
References: 11
Authors: 4
Name                 Order  Citations  PageRank
Faisal Saeed         1      37         13.24
Ali Ahmed            2      1          1.02
Mohd Shahir Shamsir  3      13         6.70
Naomie Salim         4      424        48.23