Abstract |
---|
Contextual bandit algorithms are sensitive to both the estimation method for the outcome model and the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation problems along the path of learning. We develop algorithms for contextual bandits with linear payoffs that integrate balancing methods from the causal inference literature into their estimation, making it less prone to estimation bias. We provide the first regret-bound analyses for linear contextual bandits with balancing and show that our algorithms match state-of-the-art theoretical guarantees. We demonstrate the strong practical advantage of balanced contextual bandits on a large number of supervised learning datasets and on a synthetic example that simulates model mis-specification and prejudice in the initial training data. |
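The core idea described in the abstract — re-weighting observations by inverse propensities so that the outcome-model fit is less biased by the adaptive action assignment — can be sketched as a propensity-weighted ridge regression. This is a minimal illustrative sketch, not the authors' exact algorithm; the function name `balanced_ridge`, the regularization parameter, and the simulated data are all assumptions for demonstration.

```python
import numpy as np

def balanced_ridge(X, y, propensities, lam=1.0):
    """Propensity-weighted ridge estimate of a linear reward model.

    Weighting each observation by 1 / propensity re-balances data
    collected under an adaptive (bandit) policy, mitigating the
    estimation bias that non-random action assignment introduces.
    """
    w = 1.0 / np.asarray(propensities)           # importance weights
    Xw = X * w[:, None]                          # weight each row
    A = X.T @ Xw + lam * np.eye(X.shape[1])      # weighted Gram matrix
    b = Xw.T @ y                                 # weighted moment vector
    theta = np.linalg.solve(A, b)                # ridge solution
    return theta, A

# Illustrative use: contexts observed with known assignment propensities.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + 0.01 * rng.normal(size=200)
p = np.full(200, 0.5)                            # uniform propensities here
theta_hat, A = balanced_ridge(X, y, p, lam=1e-6)
```

In a full bandit algorithm, the weighted matrix `A` would also feed the exploration bonus (e.g., a LinUCB-style term proportional to `sqrt(x @ inv(A) @ x)`); here only the balanced estimation step is shown.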
Year | Venue | Field
---|---|---|
2018 | Thirty-Third AAAI Conference on Artificial Intelligence / Thirty-First Innovative Applications of Artificial Intelligence Conference / Ninth AAAI Symposium on Educational Advances in Artificial Intelligence | Training set, Causal inference, Regret, Computer science, Supervised learning, Artificial intelligence, Prejudice (legal term), Machine learning

DocType | Volume | Citations
---|---|---|
Journal | abs/1812.06227 | 0

PageRank | References | Authors
---|---|---|
0.34 | 0 | 4
Name | Order | Citations | PageRank
---|---|---|---|
Maria Dimakopoulou | 1 | 10 | 3.61 |
Zhengyuan Zhou | 2 | 141 | 19.63 |
Susan Athey | 3 | 23 | 4.67 |
Guido Imbens | 4 | 5 | 1.14 |