Paper Info

Title
Reinforcement-learning agents with different temperature parameters explain the variety of human action-selection behavior in a Markov decision process task

Abstract
We investigated the characteristics of human action selection in a Markov decision process (MDP) task and compared them with those of reinforcement-learning (RL) agents. The behavior of the human participants fell roughly into two qualitatively different types. Surprisingly, however, this variety of human behavior could be explained by a single parameter: the degree of randomness (i.e., the temperature parameter) in the action-selection rule of the RL agents. This result implies that the variety of human action-selection behavior may be determined by a simple mechanism in the brain.
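
The abstract does not spell out the action-selection rule, so the following is only a minimal sketch of how a temperature parameter typically controls randomness, assuming a Boltzmann (softmax) rule over action values. The function names, the action values, and the temperatures are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax_probabilities(action_values, temperature):
    """Return Boltzmann (softmax) selection probabilities for each action.

    Assumption: P(a) is proportional to exp(Q(a) / temperature). This is a
    common rule in RL, not necessarily the exact rule used in the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum value before exponentiating for numerical stability.
    max_value = max(action_values)
    weights = [math.exp((q - max_value) / temperature) for q in action_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(action_values, temperature, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probabilities(action_values, temperature)
    return rng.choices(range(len(action_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q_values = [1.0, 0.5, 0.0]  # hypothetical action values for one state
    for temperature in (0.1, 1.0, 10.0):
        probs = softmax_probabilities(q_values, temperature)
        print(temperature, [round(p, 3) for p in probs])
```

Under this assumed rule, a low temperature makes the agent choose the highest-valued action almost deterministically, while a high temperature pushes the choice toward uniform randomness. Varying this one parameter is the kind of mechanism the abstract credits with reproducing the different behavior types.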

Year	DOI	Venue
2009	10.1016/j.neucom.2008.04.009	Neurocomputing
Keywords	Field	DocType
qualitatively different type,markov decision process,human participant,various behavior,single parameter,human action-selection behavior,different temperature parameter,human action-selection,rl agent,human behavior,action-selection rule,temperature parameter,markov decision process task,reinforcement-learning agent,action selection,reinforcement learning	Partially observable Markov decision process,Markov decision process,Artificial intelligence,Action selection,Mathematics,Machine learning,Reinforcement learning,Randomness	Journal
Volume	Issue	ISSN
72	7-9	0925-2312
Citations	PageRank	References
1	0.40	5
Authors
4

Authors (4 rows)

Name	Order	Citations	PageRank
Fumihiko Ishida	1	3	1.18
Takahiro Sasaki	2	1	0.40
Yutaka Sakaguchi	3	26	7.81
Hiroyuki Shimai	4	10	2.11
