Abstract |
---|
We consider Competitive Markov Decision Processes in which the controllers/players are antagonistic and aggregate their sequences of expected rewards according to “weighted” or “horizon-sensitive” criteria. These criteria are either a convex combination of two discounted objectives, or of one discounted and one limiting average reward objective. In both cases we establish the existence of the game-theoretic value vector, and supply a description of ε-optimal non-stationary strategies. |
Year | DOI | Venue |
---|---|---|
1992 | 10.1007/BF01416234 | Math. Meth. of OR |
Keywords | Field | DocType |
convex combination,markov decision process | Mathematical economics,Mathematical optimization,Weighting,Convex combination,Markov decision process,Game theory,Decision process,Mathematics,Limiting,Stochastic game | Journal |
Volume | Issue | Citations |
36 | 4 | 9 |
PageRank | References | Authors |
5.53 | 0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jerzy A. Filar | 1 | 149 | 29.59 |
O. J. Vrieze | 2 | 49 | 19.22 |