Abstract |
---|
Every time a system places an ad, presents a search ranking, or makes a recommendation, we can think of this as an intervention for which we can observe the user's response (e.g., click, dwell time, purchase). Such logged intervention data is one of the most plentiful types of data available, as it can be recorded from a variety of systems (e.g., search engines, recommender systems, ad placement) at little cost. However, this data provides only partial information, namely feedback limited to the particular intervention chosen by the system. We do not get to see how the user would have responded if we had chosen a different intervention. This makes learning from logged bandit feedback substantially different from conventional supervised learning, where correct predictions together with a loss function provide full-information feedback. It is also different from online learning in the bandit setting, since the algorithm does not assume interactive control of the interventions. In this talk, I will explore methods for batch learning from logged bandit feedback (BLBF). Following the inductive principle of Counterfactual Risk Minimization for BLBF, this talk presents an approach to training linear models and deep networks from propensity-scored bandit feedback. |
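To make the setup concrete, the following is a minimal sketch of learning from propensity-scored logged bandit feedback via an inverse-propensity-scored (IPS) risk estimate. All names and the data here are illustrative assumptions: a uniform logging policy, a linear softmax policy, synthetic losses, and a numerical gradient for brevity. It is not the implementation presented in the talk, only an instance of the general idea.

```python
import numpy as np

# Hypothetical logged bandit feedback: contexts X, actions A chosen by the
# logging policy, observed losses delta, and logging propensities p0.
rng = np.random.default_rng(0)
n, d, k = 500, 5, 3                      # examples, features, actions
X = rng.normal(size=(n, d))
A = rng.integers(0, k, size=n)           # logged actions
p0 = np.full(n, 1.0 / k)                 # uniform logging policy (assumed)
delta = rng.normal(size=n)               # observed losses (bandit feedback)

def softmax_probs(W, X):
    """Action probabilities of a linear softmax policy pi_W(a | x)."""
    Z = X @ W
    Z -= Z.max(axis=1, keepdims=True)    # numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def ips_risk(W):
    """IPS estimate of the new policy's expected loss: mean of
    delta_i * pi_W(a_i | x_i) / p0_i over the logged data."""
    pi = softmax_probs(W, X)[np.arange(n), A]
    return np.mean(delta * pi / p0)

# Gradient descent on the IPS objective (numerical gradient for brevity).
W = np.zeros((d, k))
eps, lr = 1e-5, 0.1
for _ in range(50):
    grad = np.zeros_like(W)
    for i in range(d):
        for j in range(k):
            Wp = W.copy(); Wp[i, j] += eps
            Wm = W.copy(); Wm[i, j] -= eps
            grad[i, j] = (ips_risk(Wp) - ips_risk(Wm)) / (2 * eps)
    W -= lr * grad
```

Counterfactual Risk Minimization goes beyond plain IPS by additionally penalizing the empirical variance of the estimate, since IPS weights can make the estimate unreliable for policies far from the logging policy; that regularizer is omitted in this sketch.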
Year | DOI | Venue |
---|---|---|
2018 | 10.1145/3270323.3270324 | Proceedings of the 3rd Workshop on Deep Learning for Recommender Systems (DLRS) |
Field | DocType | Citations |
---|---|---|
Recommender system, Psychological intervention, Computer science, Linear model, Supervised learning, Counterfactual thinking, Data type, Minification, Artificial intelligence, Deep learning, Machine learning | Conference | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 0 | 1 |
Name | Order | Citations | PageRank |
---|---|---|---|
Thorsten Joachims | 1 | 17387 | 1254.06 |