Title
A Grid-Based Scalable Classifier For High Dimensional Datasets
Abstract
High dimensionality and large size are two common characteristics of real-world datasets and databases, and both pose particular challenges for classification. Classification algorithms that perform well, in terms of scalability and efficiency, on small and medium datasets of moderate dimensionality fail to scale to large, high-dimensional datasets. In this paper, we therefore propose a scalable classifier for large, high-dimensional datasets. The proposed method derives its scalability from grid-based partitioning: the data space is divided into small partitions called cells, and the data are mapped onto the partitioned space. Instead of handling individual data points, the classifier works with these cells as abstract entities, which reduces classification runtime on large, high-dimensional datasets. The experimental results presented demonstrate the scalability and efficiency of our algorithm.
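The grid-based idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes equal-width bins per dimension, a sparse dictionary that stores only non-empty cells, and a majority vote per cell. The names `cell_of`, `fit_grid`, `predict`, and the parameter `bins` are hypothetical.

```python
from collections import Counter, defaultdict


def cell_of(point, lows, widths, bins):
    """Map a point to the index tuple of the grid cell that contains it."""
    cell = []
    for x, lo, w in zip(point, lows, widths):
        i = int((x - lo) / w) if w > 0 else 0
        cell.append(min(max(i, 0), bins - 1))  # clamp out-of-range values to the grid
    return tuple(cell)


def fit_grid(points, labels, bins=4):
    """Build a sparse grid: only non-empty cells are stored, each with class counts.

    Assumption: equal-width partitioning over the observed range of each dimension.
    """
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    widths = [(hi - lo) / bins for lo, hi in zip(lows, highs)]
    cells = defaultdict(Counter)
    for p, y in zip(points, labels):
        cells[cell_of(p, lows, widths, bins)][y] += 1
    return {
        "lows": lows,
        "widths": widths,
        "bins": bins,
        "cells": dict(cells),
        # Fallback label for queries that land in an empty cell (an assumption).
        "default": Counter(labels).most_common(1)[0][0],
    }


def predict(model, point):
    """Return the majority class of the point's cell, or the global majority class."""
    cell = cell_of(point, model["lows"], model["widths"], model["bins"])
    counts = model["cells"].get(cell)
    if counts is None:
        return model["default"]
    return counts.most_common(1)[0][0]
```

In this sketch, prediction costs one cell lookup per query regardless of how many training points there are, which is the kind of saving the abstract attributes to working with cells rather than individual points. Storing only non-empty cells matters in high dimensions, where the total number of cells grows exponentially with the number of dimensions.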
Year
2010
DOI
10.1007/978-3-642-12035-0_42
Venue
INFORMATION SYSTEMS, TECHNOLOGY AND MANAGEMENT, PROCEEDINGS
Keywords
Classification, scalability, grid-based
Field
Data point, Data mining, Data space, Computer science, Curse of dimensionality, Statistical classification, Classifier (linguistics), Grid, Scalability
DocType
Conference
Volume
54
ISSN
1865-0929
Citations
1
PageRank
0.35
References
3
Authors
2
Name, Order, Citations, PageRank
Sheetal Saini, 1, 1, 0.69
Sumeet Dua, 2, 275, 24.31