Score-based Inverse Reinforcement Learning. - Citegraph

Paper Info

Title
Score-based Inverse Reinforcement Learning.

Abstract
This paper reports theoretical and empirical results obtained for the score-based Inverse Reinforcement Learning (IRL) algorithm. It relies on a non-standard setting for IRL consisting of learning a reward from a set of globally scored trajectories. This allows using any type of policy (optimal or not) to generate trajectories without prior knowledge during data collection. This way, any existing database (like logs of systems in use) can be scored a posteriori by an expert and used to learn a reward function. Thanks to this reward function, it is shown that a near-optimal policy can be computed. Being related to least-squares regression, the algorithm (called SBIRL) comes with theoretical guarantees that are proven in this paper. SBIRL is compared to standard IRL algorithms on synthetic data, showing that annotations do help under conditions on the quality of the trajectories. It is also shown to be suitable for real-world applications such as the optimisation of a spoken dialogue system.
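
The abstract's link between globally scored trajectories and least-squares regression can be illustrated with a minimal sketch. It is not the paper's implementation and rests on assumptions not stated in this record: the reward is linear in state features, r(s) = theta . phi(s), and each expert score approximates the discounted sum of rewards along the trajectory. All function names, the discount factor, the small ridge term, and the noiseless synthetic data below are hypothetical and chosen only for illustration.

```python
import random


def discounted_features(trajectory, gamma):
    """Return sum_t gamma**t * phi(s_t) for a trajectory given as a list of feature vectors."""
    dim = len(trajectory[0])
    mu = [0.0] * dim
    for t, phi in enumerate(trajectory):
        weight = gamma ** t
        for k in range(dim):
            mu[k] += weight * phi[k]
    return mu


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_reward_weights(trajectories, scores, gamma=0.9, ridge=1e-8):
    """Least-squares fit of theta so that score_i is approximately theta . mu_i.

    Assumption: linear reward and scores that behave like discounted returns.
    The tiny ridge term only keeps the normal equations well conditioned.
    """
    mus = [discounted_features(tr, gamma) for tr in trajectories]
    dim = len(mus[0])
    # Normal equations: (M^T M + ridge * I) theta = M^T y
    a = [[sum(mu[j] * mu[k] for mu in mus) + (ridge if j == k else 0.0)
          for k in range(dim)] for j in range(dim)]
    b = [sum(mu[j] * y for mu, y in zip(mus, scores)) for j in range(dim)]
    return solve_linear_system(a, b)


# Synthetic illustration only: random features, noiseless scores from a known theta.
rng = random.Random(0)
true_theta = [1.0, -0.5, 2.0]
trajectories = [[[rng.random() for _ in range(3)] for _ in range(5)]
                for _ in range(50)]
scores = [sum(w * f for w, f in zip(true_theta, discounted_features(tr, 0.9)))
          for tr in trajectories]
theta_hat = fit_reward_weights(trajectories, scores, gamma=0.9)
```

Because the regression needs only a score per trajectory, the trajectories here come from an arbitrary (random) behaviour rather than from an optimal policy, which mirrors the setting described in the abstract. With real annotations the scores would be noisy, and the recovered weights would only approximate the underlying reward.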

Year	DOI	Venue
2016	10.5555/2936924.2936991	AAMAS
Keywords	Field	DocType
Reinforcement Learning,Inverse Reinforcement Learning,Markov Decision Processes,Learning from Demonstration,Spoken Dialogue Systems	Data collection,Computer science,A priori and a posteriori,Markov decision process,Q-learning,Unsupervised learning,Synthetic data,Artificial intelligence,Machine learning,Learning classifier system,Reinforcement learning	Conference
ISBN	Citations	PageRank
978-1-4503-4239-1	0	0.34
References	Authors
0	5

Authors (5 rows)

Name	Order	Citations	PageRank
Layla El Asri	1	72	7.42
Bilal Piot	2	335	20.65
Matthieu Geist	3	385	44.31
Romain Laroche	4	110	17.35
Olivier Pietquin	5	664	68.60
