Title
Making kernel-based vector quantization robust and effective for incomplete educational data clustering.
Abstract
Knowledge discovered from educational data sets plays an important role in educational decision-making support. One kind of such knowledge, which gives insight into students' characteristics, is the cluster model produced by a clustering task. Each cluster model presents groups of similar students along several aspects such as study performance, behavior, and skill. Many recent educational data clustering works used existing algorithms such as k-means, expectation-maximization, and spectral clustering. Nevertheless, none of them considered the incompleteness of the educational data gathered in an academic credit system, although incomplete data handling has been addressed well by several general-purpose solutions. Early detection of in-trouble students normally faces data incompleteness because the study results are collected and processed for second-, third-, and fourth-year students who have not yet completed the program at that moment. In this situation, the clustering task inevitably becomes an incomplete educational data clustering task. Hence, our work focuses on an incomplete educational data clustering approach to this task. Following kernel-based vector quantization, we define a robust, effective, and simple solution, named VQ_fk_nps, which not only handles ubiquitous data incompleteness iteratively using the nearest prototype strategy but also optimizes the clusters in the feature space so that the resulting clusters can have arbitrary shapes in the data space. Experimental results on real educational data sets show that the clusters from our solution have better cluster quality than those of some existing approaches.
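The general idea in the abstract, clustering in a kernel-induced feature space while repeatedly filling missing values from the nearest prototype, can be illustrated with a small sketch. This is not the authors' VQ_fk_nps algorithm, whose details are not given here. It is a plain kernel k-means loop with a simplified imputation step, and the function names (`rbf`, `cluster_incomplete`), the Gaussian kernel, the `gamma` parameter, the column-mean initial fill, and the use of the data-space cluster mean as the prototype are all assumptions made for illustration.

```python
import math
import random


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two complete vectors (assumed kernel choice)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def cluster_incomplete(rows, k, gamma=1.0, n_iter=20, seed=0):
    """Illustrative sketch only. `rows` is a list of lists; None marks a missing value.

    Returns (labels, filled), where `filled` has every None replaced.
    Assumes every column has at least one observed value.
    """
    n, d = len(rows), len(rows[0])

    # Initial fill: column mean over the observed values (an assumption).
    col_mean = []
    for j in range(d):
        observed = [r[j] for r in rows if r[j] is not None]
        col_mean.append(sum(observed) / len(observed))
    filled = [[col_mean[j] if r[j] is None else float(r[j]) for j in range(d)]
              for r in rows]

    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in range(n)]

    for _ in range(n_iter):
        # Kernel matrix on the currently filled data.
        K = [[rbf(filled[i], filled[j], gamma) for j in range(n)] for i in range(n)]
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Squared norm of each cluster mean in the feature space.
        within = [sum(K[a][b] for a in m for b in m) / len(m) ** 2 if m else 0.0
                  for m in members]

        # Assign each object to the nearest cluster mean in the feature space.
        new_labels = []
        for i in range(n):
            best, best_dist = labels[i], math.inf
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                dist = K[i][i] - 2.0 * sum(K[i][a] for a in m) / len(m) + within[c]
                if dist < best_dist:
                    best, best_dist = c, dist
            new_labels.append(best)

        # Simplified nearest-prototype step: refill missing entries from the
        # data-space mean of the cluster the object now belongs to.
        for c in range(k):
            m = [i for i in range(n) if new_labels[i] == c]
            if not m:
                continue
            proto = [sum(filled[i][j] for i in m) / len(m) for j in range(d)]
            for i in m:
                for j in range(d):
                    if rows[i][j] is None:
                        filled[i][j] = proto[j]

        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels

    return labels, filled


# Hypothetical toy data: two study-result attributes, None where a grade is missing.
rows = [[0.0, 0.1], [0.2, None], [0.1, 0.0], [5.0, 5.1], [None, 5.0], [5.2, 4.9]]
labels, filled = cluster_incomplete(rows, k=2, gamma=0.5)
```

Because the assignment step only needs kernel values, the cluster means never have to be written out in the feature space. The imputation step falls back to data-space means, which is a simplification chosen for this sketch and not a claim about how the paper defines its prototypes.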
Year: 2016
DOI: 10.1007/s40595-016-0060-6
Venue: Vietnam J. Computer Science
Keywords: Incomplete data clustering, Educational data mining, Kernel-based vector quantization, Nearest prototype strategy, Non-spherical cluster
Field: Data mining, Fuzzy clustering, CURE data clustering algorithm, Clustering high-dimensional data, Data stream clustering, Correlation clustering, Computer science, Artificial intelligence, Constrained clustering, Cluster analysis, Educational data mining, Machine learning
DocType: Journal
Volume: 3
Issue: 2
ISSN: 2196-8896
Citations: 2
PageRank: 0.40
References: 18
Authors: 3
Name                Order  Citations  PageRank
Thi Ngoc Chau Vo    1      49         8.68
Hua Phung Nguyen    2      9          3.35
Thi Ngoc Tran Vo    3      2          0.40