Abstract |
---|
We present a model-free method for solving the Inverse Reinforcement Learning (IRL) problem given a set of trajectories generated by different experts' policies. In many applications, the observed demonstrations are not produced by the same policy. In fact, they may be provided by multiple experts who follow different (but similar) policies, or even by a single expert who does not always replicate the same policy (e.g., a human expert). We propose to model the different experts' demonstrations as generated by policies sampled from a distribution. Unlike many other IRL approaches, the proposed methodology requires neither knowledge of the system dynamics nor iteratively solving the forward problem for different candidate reward functions, thus providing an efficient solution to the IRL problem. |
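The abstract's central modeling assumption, that demonstrations come from policies sampled from a common distribution, can be illustrated with a minimal toy sketch. Everything below (the Gaussian over a single policy parameter, the chain environment, all function names) is a hypothetical illustration of that assumption, not the paper's actual method:

```python
import random

def sample_expert_policy(mean=0.5, std=0.1, rng=None):
    """Draw one expert's policy parameter theta from a population-level
    Gaussian, so each expert behaves similarly but not identically."""
    rng = rng or random.Random()
    theta = min(1.0, max(0.0, rng.gauss(mean, std)))
    # The (stateless) policy: take action 1 with probability theta.
    return lambda state: 1 if rng.random() < theta else 0

def generate_trajectory(policy, length=5, rng=None):
    """Roll out a toy chain environment where action 1 advances the state."""
    rng = rng or random.Random()
    state, traj = 0, []
    for _ in range(length):
        action = policy(state)
        traj.append((state, action))
        state += action
    return traj

# Each demonstration is produced by a freshly sampled expert policy,
# mimicking a dataset collected from multiple similar experts.
rng = random.Random(0)
demos = [generate_trajectory(sample_expert_policy(rng=rng), rng=rng)
         for _ in range(3)]
```

An IRL method operating under this assumption would then try to recover the shared reward function that explains all of `demos`, rather than fitting one policy per trajectory.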
Year | Venue | Field
---|---|---
2017 | 2017 IEEE Symposium Series on Computational Intelligence (SSCI) | Computer science, Maximum likelihood, Inverse reinforcement learning, Minification, Artificial intelligence, Multiple experts, Trajectory, Replicate
DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34

References | Authors
---|---
0 | 4
Name | Order | Citations | PageRank
---|---|---|---
Davide Tateo | 1 | 3 | 4.82 |
Matteo Pirotta | 2 | 78 | 18.50 |
Marcello Restelli | 3 | 416 | 61.31 |
Andrea Bonarini | 4 | 623 | 76.73 |