Data sparsity issues in the collaborative filtering framework - Citegraph

Paper Info

Title
Data sparsity issues in the collaborative filtering framework

Abstract
With the amount of information available on the Web growing rapidly every day, the need to filter that information automatically, so that users can work more efficiently, has emerged. Within the fields of user profiling and Web personalization, several popular content filtering techniques have been developed. In this chapter we present one such technique: collaborative filtering. Apart from giving an overview of collaborative filtering approaches, we present experimental results comparing the k-Nearest Neighbor (kNN) algorithm with the Support Vector Machine (SVM) in the collaborative filtering framework, using datasets with different properties. While kNN is the algorithm usually used for collaborative filtering tasks, SVM is considered a state-of-the-art classification algorithm. Since collaborative filtering can also be interpreted as a classification/regression task, virtually any supervised learning algorithm (such as SVM) can be applied. Experiments were performed on two standard, publicly available datasets and on a real-life corporate dataset that does not fit the profile of ideal data for collaborative filtering. We conclude that the quality of collaborative filtering recommendations is highly dependent on the sparsity of the available data. Furthermore, we show that kNN is dominant on datasets with relatively low sparsity, while SVM-based approaches may perform better on highly sparse data.
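
The two notions the abstract relies on, data sparsity and kNN-based rating prediction, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy matrix `R`, the convention that 0 marks a missing rating, the helper names `sparsity` and `knn_predict`, and the use of cosine similarity over zero-filled rating vectors.

```python
import numpy as np


def sparsity(ratings: np.ndarray) -> float:
    """Fraction of user-item cells with no rating (0 is assumed to mean 'missing')."""
    return 1.0 - np.count_nonzero(ratings) / ratings.size


def knn_predict(ratings: np.ndarray, user: int, item: int, k: int = 2) -> float:
    """Similarity-weighted mean rating of the k most similar users who rated `item`."""
    # Only users other than the target who actually rated the item can vote.
    candidates = [u for u in range(ratings.shape[0])
                  if u != user and ratings[u, item] > 0]
    if not candidates:
        return float("nan")

    target = ratings[user]
    sims = []
    for u in candidates:
        other = ratings[u]
        denom = np.linalg.norm(target) * np.linalg.norm(other)
        # Cosine similarity over zero-filled vectors (a simplification).
        sims.append(float(target @ other / denom) if denom > 0 else 0.0)

    # Indices of the k highest similarities, best first.
    top = np.argsort(sims)[::-1][:k]
    weights = np.array([sims[i] for i in top])
    values = np.array([ratings[candidates[i], item] for i in top])
    if weights.sum() == 0:
        return float(values.mean())
    return float(weights @ values / weights.sum())


# Hypothetical toy data: rows are users, columns are items, 0 = not rated.
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

print("sparsity:", sparsity(R))
print("predicted rating for user 1, item 1:", knn_predict(R, user=1, item=1))
```

As the matrix gets sparser, fewer users share rated items, so the similarity estimates in a sketch like this rest on less evidence; that is the kind of effect the abstract attributes to highly sparse data.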

Year	DOI	Venue
2005	10.1007/11891321_4	WEBKDD
Keywords	Field	DocType
available data,state-of-the-art classification algorithm,sparse data,support vector machine,available information,supervised learning algorithm,available datasets,k-nearest neighbor algorithm,data sparsity issue,ideal data,greater user efficiency,k nearest neighbor,support vector,supervised learning,col,web personalization,collaborative filtering	Recommender system,Data mining,Collaborative filtering,Computer science,Collaborative software,Support vector machine,Sparse approximation,Filter (signal processing),Supervised learning,Artificial intelligence,Machine learning,Information filtering system	Conference
Volume	ISSN	ISBN
4198	0302-9743	3-540-46346-1
Citations	PageRank	References
37	1.95	18
Authors
4

Authors (4 rows)

Name	Order	Citations	PageRank
Miha Grcar	1	224	15.71
Dunja Mladenic	2	1484	170.14
Blaž Fortuna	3	127	9.55
Marko Grobelnik	4	1032	126.90
