Title
Towards reproducibility in recommender-systems research
Abstract
Numerous recommendation approaches are in use today. However, comparing their effectiveness is a challenging task because evaluation results are rarely reproducible. In this article, we examine the challenge of reproducibility in recommender-systems research. We conduct experiments using Plista's news recommender system and Docear's research-paper recommender system. The experiments show that there are large discrepancies in the effectiveness of identical recommendation approaches in only slightly different scenarios, as well as large discrepancies for slightly different approaches in identical scenarios. For example, in one news-recommendation scenario, the performance of a content-based filtering approach was twice as high as that of the second-best approach, while in another scenario the same content-based filtering approach was the worst-performing approach. We found several determinants that may contribute to the large discrepancies observed in recommendation effectiveness. Determinants we examined include user characteristics (gender and age), datasets, weighting schemes, the time at which recommendations were shown, and user-model size. Some of the determinants have interdependencies. For instance, the optimal size of an algorithm's user model depended on users' age. Since minor variations in approaches and scenarios can lead to significant changes in a recommendation approach's performance, ensuring reproducibility of experimental results is difficult. We discuss these findings and conclude that to ensure reproducibility, the recommender-system community needs to (1) survey other research fields and learn from them, (2) find a common understanding of reproducibility, (3) identify and understand the determinants that affect reproducibility, (4) conduct more comprehensive experiments, (5) modernize publication practices, (6) foster the development and use of recommendation frameworks, and (7) establish best-practice guidelines for recommender-systems research.
Year: 2016
DOI: 10.1007/s11257-016-9174-x
Venue: User Model. User-Adapt. Interact.
Keywords: Recommender systems, Evaluation, Experimentation, Reproducibility
Field: Interdependence, Recommender system, Data mining, Reproducibility, Weighting, Computer science, Filter (signal processing), Artificial intelligence, User modeling, Machine learning
DocType: Journal
Volume: 26
Issue: 1
ISSN: 0924-1868
Citations: 15
PageRank: 0.67
References: 65
Authors: 5

Name                   Order   Citations   PageRank
Jöran Beel             1       65          9.60
Corinna Breitinger     2       214         15.98
Stefan Langer          3       97          8.10
Andreas Lommatzsch     4       479         40.83
Bela Gipp              5       432         51.77