Title
Sequential Decision Making With Coherent Risk
Abstract
We provide sampling-based algorithms for optimization under a coherent-risk objective. The class of coherent risk measures is widely accepted in finance and operations research, among other fields, and encompasses popular risk measures such as conditional value at risk and mean-semideviation. Our approach is suitable for problems in which tunable parameters control the distribution of the cost, such as reinforcement learning or approximate dynamic programming with a parameterized policy. Such problems cannot be solved using previous approaches. We consider both static risk measures and time-consistent dynamic risk measures. For static risk measures, our approach is in the spirit of policy gradient methods, while for dynamic risk measures, we use actor-critic type algorithms.
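As a rough illustration of the two risk measures named in the abstract, the sketch below estimates them from a list of sampled costs. It is not the paper's algorithm: the function names are hypothetical, and the conventions are assumptions (costs where larger is worse, conditional value at risk taken as the mean of the worst alpha-fraction of samples, and mean-semideviation taken as the mean plus beta times the root of the mean squared upper deviation).

```python
import math


def empirical_cvar(costs, alpha):
    """Assumed convention: mean of the worst (largest) ceil(alpha * N) sampled costs."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    if not costs:
        raise ValueError("costs must be non-empty")
    ordered = sorted(costs, reverse=True)          # worst outcomes first
    k = max(1, math.ceil(alpha * len(ordered)))    # size of the tail to average
    return sum(ordered[:k]) / k


def empirical_mean_semideviation(costs, beta):
    """Assumed convention: mean + beta * sqrt(mean squared deviation above the mean)."""
    if not costs:
        raise ValueError("costs must be non-empty")
    n = len(costs)
    mean = sum(costs) / n
    # Only deviations above the mean (the unfavourable side for costs) are penalised.
    upper_sq = sum(max(c - mean, 0.0) ** 2 for c in costs) / n
    return mean + beta * math.sqrt(upper_sq)


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 10.0]  # hypothetical sampled costs
    print(empirical_cvar(samples, 0.5))
    print(empirical_mean_semideviation(samples, 0.5))
```

With alpha equal to 1 the tail covers every sample, so the first estimate reduces to the ordinary sample mean; with beta equal to 0 the second does the same. In a sampling-based optimization loop, estimates of this kind would be recomputed from fresh trajectories as the policy parameters change.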
Year: 2017
DOI: 10.1109/TAC.2016.2644871
Venue: IEEE Trans. Automat. Contr.
Keywords: Markov processes, Heuristic algorithms, Optimization, Dynamic programming, Standards, Electronic mail, Random variables
Field: Dynamic programming, Mathematical optimization, Random variable, Parameterized complexity, Markov process, Computer science, Markov decision process, Sampling (statistics), Reinforcement learning, Expected shortfall
DocType: Journal
Volume: 62
Issue: 7
ISSN: 0018-9286
Citations: 4
PageRank: 0.41
References: 31
Authors: 4
Name                    Order   Citations   PageRank
Aviv Tamar              1       275         24.04
Yinlam Chow             2       98          14.03
Mohammad Ghavamzadeh    3       814         67.73
Shie Mannor             4       3340        285.45