Abstract |
---|
In many commercial systems, only the 'best bet' recommendations are shown, while the predicted rating values are not. This is usually referred to as a top-N recommendation task, where the goal of the recommender system is to find a few specific items that are supposed to be most appealing to the user. Common methodologies based on error metrics (such as RMSE) are not a natural fit for evaluating the top-N recommendation task. Rather, top-N performance can be directly measured by alternative methodologies based on accuracy metrics (such as precision/recall). An extensive evaluation of several state-of-the-art recommender algorithms suggests that algorithms optimized for minimizing RMSE do not necessarily perform as expected on the top-N recommendation task. Results show that improvements in RMSE often do not translate into accuracy improvements. In particular, a naive non-personalized algorithm can outperform some common recommendation approaches and almost match the accuracy of sophisticated algorithms. Another finding is that the few most popular items can skew top-N performance. The analysis points out that, when evaluating a recommender algorithm on the top-N recommendation task, the test set should be chosen carefully in order not to bias accuracy metrics towards non-personalized solutions. Finally, we offer practitioners new variants of two collaborative filtering algorithms that, regardless of their RMSE, significantly outperform other recommender algorithms on the top-N recommendation task, while offering additional practical advantages. This comes as a surprise given the simplicity of these two methods. |
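The abstract contrasts error metrics (RMSE) with accuracy metrics (precision/recall) for evaluating top-N recommendation. As a rough illustration of the latter, here is a minimal sketch of precision@N and recall@N for a single user; the function name and toy data are illustrative assumptions, not taken from the paper.

```python
def precision_recall_at_n(ranked_items, relevant_items, n):
    """Precision@N and recall@N for one user.

    ranked_items: item ids ordered by predicted score, best first.
    relevant_items: set of item ids the user actually liked in the test set.
    """
    hits = len(set(ranked_items[:n]) & relevant_items)
    precision = hits / n
    recall = hits / len(relevant_items) if relevant_items else 0.0
    return precision, recall

# Toy usage (hypothetical data): the recommender ranked five items;
# the user's test set contains items 3 and 9.
p, r = precision_recall_at_n([3, 7, 1, 9, 5], {3, 9}, n=3)
print(f"precision@3 = {p:.2f}, recall@3 = {r:.2f}")  # 0.33, 0.50
```

Averaging these values over all test users (and sweeping N) yields the precision/recall figures this style of evaluation reports; RMSE, by contrast, measures rating-prediction error and ignores the ranking of recommended items entirely.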
Year | DOI | Venue |
---|---|---|
2010 | 10.1145/1864708.1864721 | RecSys |
Keywords | Field | DocType
---|---|---|
error metrics, accuracy improvement, top-n recommendation task, recommender algorithm, state-of-the-art recommender algorithm, common recommendation approach, recommender system, top-n performance, accuracy metrics, bias accuracy metrics, recall, precision, collaborative filtering, evaluation | Data mining, Computer science, Mean squared error, Artificial intelligence, Surprise, Recommender system, Collaborative filtering, Information retrieval, Algorithm, Skew, Recall, Machine learning, Test set | Conference
Citations | PageRank | References
---|---|---|
535 | 16.00 | 9
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Paolo Cremonesi | 1 | 1306 | 87.23 |
Yehuda Koren | 2 | 9090 | 484.08 |
Roberto Turrin | 3 | 859 | 34.94 |