Title |
---|
One Arm to Rule Them All: Online Learning with Multi-armed Bandits for Low-Resource Conversational Agents |
Abstract |
---|
In a low-resource scenario, the lack of annotated data can be an obstacle not only to training a robust system, but also to evaluating and comparing different approaches before deploying the best one for a given setting. We propose to dynamically find the best approach for a given setting by taking advantage of feedback naturally present in the scenario at hand (when it exists). To this end, we present a novel application of online learning algorithms, in which we frame the choice of the best approach as a multi-armed bandit problem. Our proof of concept is a retrieval-based conversational agent, in which the answer selection criteria available to the agent are the competing approaches (arms). In our experiment, an adversarial multi-armed bandit approach converges to the performance of the best criterion after just three interaction turns, which suggests the appropriateness of our approach for a low-resource conversational agent. |
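The abstract frames the choice among answer-selection criteria as an adversarial multi-armed bandit problem. The paper does not specify its algorithm or parameters here, so the following is only a minimal sketch of one standard adversarial bandit algorithm, Exp3, in which each arm is a hypothetical answer-selection criterion and the reward is user feedback scaled to [0, 1]; the arm count, exploration rate, and simulated accuracies are illustrative assumptions, not values from the paper.

```python
import math
import random

class Exp3:
    """Adversarial multi-armed bandit (Exp3 sketch).

    Illustrative assumption: each arm stands for one answer-selection
    criterion, and the reward is user feedback scaled to [0, 1].
    """

    def __init__(self, n_arms, gamma=0.1):
        self.n_arms = n_arms
        self.gamma = gamma              # exploration rate (assumed value)
        self.weights = [1.0] * n_arms

    def probabilities(self):
        # Mix the normalized weights with uniform exploration.
        total = sum(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / self.n_arms
                for w in self.weights]

    def select_arm(self):
        # Sample an arm according to the current mixed distribution.
        return random.choices(range(self.n_arms),
                              weights=self.probabilities())[0]

    def update(self, arm, reward):
        # Importance-weighted reward estimate: only the pulled arm changes.
        p = self.probabilities()[arm]
        self.weights[arm] *= math.exp(self.gamma * (reward / p) / self.n_arms)

# Simulated interaction loop: three criteria with unknown accuracies
# (hypothetical numbers, chosen only to show convergence to the best arm).
random.seed(0)
bandit = Exp3(n_arms=3)
accuracies = [0.2, 0.5, 0.8]
for _ in range(1000):
    arm = bandit.select_arm()
    feedback = 1.0 if random.random() < accuracies[arm] else 0.0
    bandit.update(arm, feedback)
```

After enough interactions the weight of the best criterion dominates, so the agent converges to playing it almost always while still exploring at rate gamma, which matches the adversarial-bandit behavior the abstract describes.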
Year | DOI | Venue |
---|---|---|
2021 | 10.1007/978-3-030-86230-5_49 | PROGRESS IN ARTIFICIAL INTELLIGENCE (EPIA 2021) |
Keywords | DocType | Volume
---|---|---
Online learning, Multi-armed bandits, Conversational agents | Conference | 12981

ISSN | Citations | PageRank
---|---|---
0302-9743 | 0 | 0.34

References | Authors
---|---
0 | 3
Name | Order | Citations | PageRank |
---|---|---|---|
Vânia Mendonça | 1 | 2 | 1.44 |
Luísa Coheur | 2 | 199 | 34.38 |
Alberto Sardinha | 3 | 36 | 8.27 |