Title
Dynamic reward shaping: training a robot by voice
Abstract
Reinforcement learning is commonly used for learning tasks in robotics; however, traditional algorithms can require very long training times. Reward shaping has recently been used to incorporate domain knowledge through extra rewards and thereby converge faster. Reward shaping functions are normally defined in advance by the user and are static. This paper introduces a dynamic reward shaping approach, in which these extra rewards are not given consistently, can vary over time, and may sometimes run contrary to what is needed to achieve the goal. In the experiments, a user provides verbal feedback while a robot performs a task, and this feedback is translated into additional rewards. It is shown that convergence can still be guaranteed as long as most of the shaping rewards given per state are consistent with the goals, and that even with fairly noisy interaction the system converges faster than traditional reinforcement learning techniques.
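The abstract describes translating a user's verbal feedback into additional shaping rewards during learning. The paper's exact algorithm is not reproduced here; the following is only a minimal sketch of the general idea in Python: tabular Q-learning in which a possibly noisy, time-varying shaping signal is added to the environment reward at each step. The environment and feedback interfaces (env.reset, env.step, get_verbal_feedback) and the action set are hypothetical placeholders, not taken from the paper.

    import random
    from collections import defaultdict

    # Hypothetical interfaces, not from the paper:
    #   env.reset() -> state
    #   env.step(state, action) -> (next_state, reward, done)
    #   get_verbal_feedback() -> extra reward in {-1.0, 0.0, +1.0}, e.g. mapped
    #   from recognized utterances such as "good"/"bad"; it may be absent (0),
    #   delayed, or occasionally wrong, which is what makes the shaping dynamic.

    ALPHA = 0.1     # learning rate
    GAMMA = 0.95    # discount factor
    EPSILON = 0.1   # exploration rate
    ACTIONS = ["forward", "backward", "left", "right"]

    Q = defaultdict(float)  # (state, action) -> estimated value

    def choose_action(state):
        # Epsilon-greedy selection over the tabular Q-values.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def q_update(state, action, reward, shaping, next_state):
        # Standard Q-learning target with the user's shaping reward simply
        # added to the environment reward. The shaping term biases learning
        # toward the user's advice but never replaces the base reward, so
        # occasional contradictory feedback can be averaged out over time,
        # consistent with the abstract's per-state consistency condition.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        target = reward + shaping + GAMMA * best_next
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])

    def run_episode(env, get_verbal_feedback, max_steps=200):
        state = env.reset()
        for _ in range(max_steps):
            action = choose_action(state)
            next_state, reward, done = env.step(state, action)
            q_update(state, action, reward, get_verbal_feedback(), next_state)
            if done:
                break
            state = next_state

Adding the shaping signal to the base reward, rather than substituting for it, is one simple way to keep the underlying task reward authoritative while letting timely feedback speed up learning.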
Year
2010
DOI
10.1007/978-3-642-16952-6_49
Venue
IBERAMIA
Keywords
verbal feedback, traditional algorithm, additional reward, extra reward, convergence time, dynamic reward, traditional reinforcement, domain knowledge, reinforcement learning, long training time
Field
Convergence, Domain knowledge, Computer science, Artificial intelligence, Robot, Robotics, Reinforcement learning
DocType
Conference
Volume
6433
ISSN
0302-9743
ISBN
3-642-16951-1
Citations
17
PageRank
0.98
References
12
Authors
3
Name                     Order  Citations  PageRank
Ana C. Tenorio-Gonzalez      1         17      1.66
Eduardo F. Morales           2        559     57.67
Luis Villaseñor-Pineda       3        403     53.74