Title |
---|
One Arm to Rule Them All: Online Learning with Multi-armed Bandits for Low-Resource Conversational Agents |
Abstract |
---|
In a low-resource scenario, the lack of annotated data can be an obstacle not only to training a robust system, but also to evaluating and comparing different approaches before deploying the best one for a given setting. We propose to dynamically find the best approach for a given setting by taking advantage of feedback naturally present in the scenario at hand (when it exists). To this end, we present a novel application of online learning algorithms, in which we frame the choice of the best approach as a multi-armed bandit problem. Our proof of concept is a retrieval-based conversational agent, in which the answer selection criteria available to the agent are the competing approaches (arms). In our experiment, an adversarial multi-armed bandit approach converges to the performance of the best criterion after just three interaction turns, which suggests the appropriateness of our approach for a low-resource conversational agent. |
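The abstract frames the choice among answer-selection criteria as an adversarial multi-armed bandit problem. The paper does not specify its algorithm or parameters here, so the following is only a minimal sketch of one standard adversarial bandit algorithm, Exp3, in which each arm is a hypothetical answer-selection criterion and the reward is user feedback scaled to [0, 1]; the arm count, exploration rate, and simulated accuracies are illustrative assumptions, not values from the paper.

```python
import math
import random

class Exp3:
    """Adversarial multi-armed bandit (Exp3 sketch).

    Illustrative assumption: each arm stands for one answer-selection
    criterion, and the reward is user feedback scaled to [0, 1].
    """

    def __init__(self, n_arms, gamma=0.1):
        self.n_arms = n_arms
        self.gamma = gamma              # exploration rate (assumed value)
        self.weights = [1.0] * n_arms

    def probabilities(self):
        # Mix the normalized weights with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n_arms
                for w in self.weights]

    def select_arm(self):
        # Sample an arm according to the current mixed distribution.
        return random.choices(range(self.n_arms),
                              weights=self.probabilities())[0]

    def update(self, arm, reward):
        # Importance-weighted reward estimate: only the pulled arm changes.
        p = self.probabilities()[arm]
        self.weights[arm] *= math.exp(self.gamma * (reward / p) / self.n_arms)

# Simulated interaction loop: three criteria with unknown accuracies
# (hypothetical numbers, chosen only to show convergence to the best arm).
random.seed(0)
bandit = Exp3(n_arms=3)
accuracies = [0.2, 0.5, 0.8]
for _ in range(1000):
    arm = bandit.select_arm()
    feedback = 1.0 if random.random() < accuracies[arm] else 0.0
    bandit.update(arm, feedback)
```

After enough interactions the weight of the best criterion dominates, so the agent converges to playing it almost always while still exploring at rate gamma, which matches the adversarial-bandit behavior the abstract describes.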
Year | DOI | Venue |
---|---|---|
2021 | 10.1007/978-3-030-86230-5_49 | PROGRESS IN ARTIFICIAL INTELLIGENCE (EPIA 2021) |
Keywords | DocType | Volume
---|---|---
Online learning, Multi-armed bandits, Conversational agents | Conference | 12981

ISSN | Citations | PageRank
---|---|---
0302-9743 | 0 | 0.34

References | Authors
---|---
0 | 3
Name | Order | Citations | PageRank |
---|---|---|---|
Vânia Mendonça | 1 | 2 | 1.44 |
Luísa Coheur | 2 | 199 | 34.38 |
Alberto Sardinha | 3 | 36 | 8.27 |