Meta Dynamic Pricing: Learning Across Experiments

Paper Info

Title
Meta Dynamic Pricing: Learning Across Experiments.

Abstract
We study the problem of learning across a sequence of price experiments for related products, focusing on implementing the Thompson sampling algorithm for dynamic pricing. We consider a practical formulation of this problem where the unknown parameters of the demand function for each product come from a prior that is shared across products, but is unknown a priori. Our main contribution is a meta dynamic pricing algorithm that learns this prior online while solving a sequence of non-overlapping pricing experiments (each with horizon $T$) for $N$ different products. Our algorithm addresses two challenges: (i) balancing the need to learn the prior (meta-exploration) with the need to leverage the current estimate of the prior to achieve good performance (meta-exploitation), and (ii) accounting for uncertainty in the estimated prior by appropriately widening the prior as a function of its estimation error, thereby ensuring convergence of each price experiment. We prove that the price of an unknown prior for Thompson sampling is negligible in experiment-rich environments (large $N$). In particular, our algorithm's meta regret can be upper bounded by $\widetilde{O}\left(\sqrt{NT}\right)$ when the covariance of the prior is known, and $\widetilde{O}\left(N^{\frac{3}{4}}\sqrt{T}\right)$ otherwise. Numerical experiments on synthetic and real auto loan data demonstrate that our algorithm significantly speeds up learning compared to prior-independent algorithms or a naive approach of greedily using the updated prior across products.
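
To make the setting in the abstract concrete, the following is a minimal toy sketch, not the paper's algorithm. It assumes a linear demand model d = alpha + beta * p + noise, a Gaussian prior over (alpha, beta) with known covariance, and a deliberately simple widening rule (adding 1/sqrt(n) to the diagonal after n completed experiments). The function names, the price bounds, the number of exploration experiments, and the rule of averaging earlier posterior means to estimate the prior mean are all illustrative assumptions, not taken from the paper.

```python
import math
import random


def inv2(m):
    """Invert a 2x2 matrix given as nested sequences."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def sample_gaussian2(mean, cov, rng):
    """Draw one sample from a 2-D Gaussian via a hand-written Cholesky factor."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(max(cov[1][1] - l21 * l21, 0.0))
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return (mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2)


def run_experiment(theta, prior_mean, prior_cov, horizon, rng,
                   noise_sd=1.0, p_min=0.5, p_max=5.0):
    """Thompson sampling for one product; returns (posterior mean, prices)."""
    prec = inv2(prior_cov)  # posterior precision matrix
    # b holds precision @ mean; each observation updates it additively.
    b = [prec[0][0] * prior_mean[0] + prec[0][1] * prior_mean[1],
         prec[1][0] * prior_mean[0] + prec[1][1] * prior_mean[1]]
    w = 1.0 / noise_sd ** 2
    prices = []
    for _ in range(horizon):
        cov = inv2(prec)
        mean = (cov[0][0] * b[0] + cov[0][1] * b[1],
                cov[1][0] * b[0] + cov[1][1] * b[1])
        alpha, beta = sample_gaussian2(mean, cov, rng)
        # Revenue p * (alpha + beta * p) peaks at -alpha / (2 * beta) if beta < 0.
        price = -alpha / (2.0 * beta) if beta < 0 else p_max
        price = min(max(price, p_min), p_max)
        demand = theta[0] + theta[1] * price + rng.gauss(0.0, noise_sd)
        prices.append(price)
        # Bayesian linear-regression update with feature vector x = (1, price).
        prec[0][0] += w
        prec[0][1] += w * price
        prec[1][0] += w * price
        prec[1][1] += w * price * price
        b[0] += w * demand
        b[1] += w * price * demand
    cov = inv2(prec)
    post_mean = (cov[0][0] * b[0] + cov[0][1] * b[1],
                 cov[1][0] * b[0] + cov[1][1] * b[1])
    return post_mean, prices


def meta_pricing(n_products, horizon, true_prior_mean=(10.0, -2.0),
                 known_cov=((1.0, 0.0), (0.0, 0.25)), n_explore=3, seed=0):
    """Run experiments in sequence; returns the prior mean used for each one."""
    rng = random.Random(seed)
    wide_cov = ((100.0, 0.0), (0.0, 100.0))  # assumed uninformative prior
    estimates, priors_used = [], []
    for i in range(n_products):
        theta = sample_gaussian2(true_prior_mean, known_cov, rng)
        if i < max(n_explore, 1):
            # Meta-exploration: ignore earlier products, use the wide prior.
            mean, cov = (0.0, 0.0), wide_cov
        else:
            # Meta-exploitation: average earlier estimates (hypothetical rule),
            # then widen the known covariance by an assumed 1/sqrt(n) term.
            n = len(estimates)
            mean = (sum(e[0] for e in estimates) / n,
                    sum(e[1] for e in estimates) / n)
            extra = 1.0 / math.sqrt(n)
            cov = ((known_cov[0][0] + extra, known_cov[0][1]),
                   (known_cov[1][0], known_cov[1][1] + extra))
        priors_used.append(mean)
        post_mean, _ = run_experiment(theta, mean, cov, horizon, rng)
        estimates.append(post_mean)
    return priors_used
```

Calling `meta_pricing(6, 20)` returns six prior means: the first three are the uninformative default, and the later ones are built from earlier experiments. The widening term shrinks as more experiments complete, which loosely mirrors the idea of trusting the estimated prior more as its estimation error falls.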

Year	Venue	DocType
2019	arXiv: Learning	Journal
Volume	Citations	PageRank
abs/1902.10918	0	0.34
References	Authors
13	3

Authors (3 rows)

Name	Order	Citations	PageRank
Hamsa Bastani	1	16	2.72
David Simchi-Levi	2	1449	151.53
Ruihao Zhu	3	14	3.27
