Abstract | ||
---|---|---|
We consider an extension of the contextual bandit setting, motivated by several practical applications, in which an unlabeled history of contexts may become available for pre-training before online decision-making begins. We propose an approach for improving the performance of contextual bandits in such a setting via adaptive, dynamic representation learning, which combines offline pre-training on the unlabeled history of contexts with online selection and modification of embedding functions. Our experiments on a variety of datasets and in different nonstationary environments demonstrate clear advantages of our approach over the standard contextual bandit. |
Year | Venue | Field |
---|---|---|
2018 | arXiv: Artificial Intelligence | Embedding,Computer science,Artificial intelligence,Machine learning,Feature learning,Adaptive representation |
DocType | Volume | Citations |
---|---|---|
Journal | abs/1802.00981 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 8 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Baihan Lin | 1 | 2 | 4.11 |
Guillermo A. Cecchi | 2 | 199 | 34.56 |
Djallel Bouneffouf | 3 | 4 | 8.88 |
Irina Rish | 4 | 912 | 81.78 |