Title
Clustering support vector machines and its application to local protein tertiary structure prediction
Abstract
Support Vector Machines (SVMs) are a new generation of machine learning techniques and have shown strong generalization capability in many data mining tasks. SVMs can handle nonlinear classification by implicitly mapping input samples from the input feature space into a higher-dimensional feature space with a nonlinear kernel function. However, SVMs scale poorly to huge datasets with millions of samples. Granular computing decomposes information into aggregates, called granules, and solves the target problem within each granule. We therefore propose a novel computational model called Clustering Support Vector Machines (CSVMs) to handle complex classification problems on huge datasets. Taking advantage of both granular computing theory and advanced statistical learning methodology, a CSVM is built specifically for each information granule, where the granules are obtained by intelligently partitioning the data with a clustering algorithm. This makes the learning task of each CSVM more specific and simpler. Moreover, because the CSVMs are built separately for each granule, they can be easily parallelized and can therefore handle huge datasets efficiently. The CSVM model is applied to the prediction of local protein tertiary structure. Compared with the conventional clustering method, the new CSVM model noticeably improves prediction accuracy for local protein tertiary structure. These encouraging experimental results indicate that the new computational model opens a new way to solve complex classification problems on huge datasets.
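The cluster-then-classify idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the abstract names neither the clustering algorithm nor the kernel, so k-means (with a deterministic farthest-first initialization) and a plain linear SVM trained by batch sub-gradient descent on the regularized hinge loss stand in for them. All function names (`kmeans`, `train_linear_svm`, `train_csvm`, `predict_csvm`) and hyperparameters are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, x):
    """Index of the center closest to x (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda k: sq_dist(centers[k], x))


def kmeans(points, k, iters=20):
    """Lloyd's k-means with farthest-first initialization (an assumed choice)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        # Keep the old center if a granule ends up empty.
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Batch sub-gradient descent on the L2-regularized hinge loss.

    Labels must be +1 or -1. A linear stand-in for the kernel SVM.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_csvm(X, y, k):
    """Partition X into k granules and fit one local SVM per granule."""
    centers = kmeans(X, k)
    assign = [nearest(centers, x) for x in X]
    models = []
    for i, c in enumerate(centers):
        idx = [j for j, a in enumerate(assign) if a == i]
        if not idx:  # empty granule: fall back to a zero model
            models.append(([0.0] * len(c), 0.0))
            continue
        # Each local model works in coordinates relative to its granule center.
        local = [[v - m for v, m in zip(X[j], c)] for j in idx]
        models.append(train_linear_svm(local, [y[j] for j in idx]))
    return centers, models


def predict_csvm(centers, models, x):
    """Route x to its nearest granule and apply that granule's SVM."""
    i = nearest(centers, x)
    w, b = models[i]
    score = sum(wj * (xj - cj) for wj, xj, cj in zip(w, x, centers[i])) + b
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Toy data: two well-separated granules whose class boundaries point in
    # opposite directions, so no single linear separator fits all points.
    X = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
    y = [-1, -1, 1, 1, 1, 1, -1, -1]
    centers, models = train_csvm(X, y, k=2)
    print([predict_csvm(centers, models, x) for x in X])
```

Because each call to `train_linear_svm` inside `train_csvm` touches only its own granule, the per-granule fits are independent of one another, which is the property the abstract relies on when it says the model can be parallelized.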
Year: 2006
DOI: 10.1007/11758525_96
Venue: International Conference on Computational Science (2)
Keywords: complex classification, high dimensional feature space, clustering support vector machine, complex classification problem, new computational model, new csvm model, novel computational model, new generation, csvms model, huge datasets, local protein tertiary structure, kernel function, computer model, feature space, data mining, machine learning, granular computing, support vector machine
Field: Data mining, Feature vector, Nonlinear system, Computer science, Support vector machine, Information extraction, Granular computing, Artificial intelligence, Cluster analysis, Sequential minimal optimization, Machine learning, Kernel (statistics)
DocType: Conference
Volume: 3992
ISSN: 0302-9743
ISBN: 3-540-34381-4
Citations: 0
PageRank: 0.34
References: 11
Authors: 5
Name              Order   Citations   PageRank
Jieyue He         1       128         18.92
Wei Zhong         2       0           0.34
Robert Harrison   3       0           0.34
Phang C. Tai      4       102         11.10
Yi Pan            5       2507        203.23