Title
Learning to Advertise with Adaptive Exposure via Constrained Two-Level Reinforcement Learning.
Abstract
In e-commerce online advertising, the traditional problem is to assign the right ad to the right user on fixed ad slots. In this paper, we investigate advertising with adaptive exposure, in which the number of ad slots and their locations change dynamically over time based on the ads' scores relative to those of recommended products. To maintain user retention and long-term revenue, exposure must satisfy two types of constraints: query-level and day-level. We model this problem as a constrained Markov decision process with per-state constraints (psCMDP) and propose a constrained two-level reinforcement learning approach that decouples the original advertising exposure optimization problem into two relatively independent sub-problems. We also propose a constrained hindsight experience replay mechanism to accelerate policy training. Experimental results on real-world datasets show that our method improves advertising revenue while satisfying constraints at different levels. In addition, the constrained hindsight experience replay mechanism significantly improves training speed and the stability of policy performance.
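To make the idea of adaptive exposure under a query-level constraint concrete, here is a minimal sketch. It is not the paper's method, which learns a policy with reinforcement learning. It only assumes that ads and recommended products are ranked together by score and that at most a fixed number of ads may appear per query. The function name `merge_with_ad_cap`, its parameters, and the example scores are all hypothetical, and the day-level constraint is not modeled.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]    # (item_id, score); hypothetical representation
Slot = Tuple[str, float, bool]   # (item_id, score, is_ad)


def merge_with_ad_cap(
    recs: List[Candidate],
    ads: List[Candidate],
    page_size: int,
    max_ads_per_query: int,
) -> List[Slot]:
    """Rank ads and recommendations together, capping ads shown per query."""
    candidates: List[Slot] = [(i, s, False) for i, s in recs]
    candidates += [(i, s, True) for i, s in ads]
    # Stable sort by descending score; on ties, recommendations stay first.
    candidates.sort(key=lambda c: -c[1])

    page: List[Slot] = []
    ads_shown = 0
    for item in candidates:
        if len(page) == page_size:
            break
        if item[2]:
            if ads_shown >= max_ads_per_query:
                continue  # assumed query-level cap reached: skip this ad
            ads_shown += 1
        page.append(item)
    return page


# Illustrative scores only (not taken from the paper).
recs = [("r1", 0.9), ("r2", 0.5), ("r3", 0.4)]
ads = [("a1", 0.8), ("a2", 0.7), ("a3", 0.1)]
page_cap1 = [item_id for item_id, _, _ in merge_with_ad_cap(recs, ads, 4, 1)]
page_cap2 = [item_id for item_id, _, _ in merge_with_ad_cap(recs, ads, 4, 2)]
```

In this sketch the number and positions of ads on the page depend on the relative scores and on the cap, rather than on fixed slots: raising the cap from 1 to 2 lets a second ad displace a recommendation.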
Year: 2018
Venue: arXiv: Learning
Field: Revenue, Mathematical optimization, Advertising, Markov decision process, Online advertising, Hindsight bias, Optimization problem, Mathematics, Reinforcement learning
DocType: Journal
Volume: abs/1809.03149
Citations: 0
PageRank: 0.34
References: 0
Authors: 11
Name            Order   Citations and PageRank
Weixun Wang     1       15.75
Junqi Jin       2       1187.95
Jianye Hao      3       18955.78
Chunjie Chen    4       188.88
Chuan Yu        5       75.46
Weinan Zhang    6       122897.24
Jun Wang        7       11042.73
Yixi Wang       8       01.01
Han Li          9       91.24
Jian Xu         10      02.03
Kun Gai         11      31220.61