CONVERGENCE OF CONTROLLED MODELS FOR CONTINUOUS-TIME MARKOV DECISION PROCESSES WITH CONSTRAINED AVERAGE CRITERIA |
Wenzhao Zhang, Xianzhu Xiong |
(College of Math. and Computer Science, Fuzhou University, Fuzhou 350108, Fujian, PR China) |
Abstract: |
This paper studies the convergence of optimal values and optimal policies of continuous-time Markov decision processes (CTMDPs) under constrained average criteria. Given an original CTMDP model $\mathcal{M}_\infty$ with a denumerable state space and a sequence $\{\mathcal{M}_n\}$ of CTMDP models with finite state spaces, we give a new convergence condition ensuring that the optimal values and optimal policies of $\{\mathcal{M}_n\}$ converge, respectively, to the optimal value and optimal policy of $\mathcal{M}_\infty$ as the state space $S_n$ of $\mathcal{M}_n$ converges to the state space $S_\infty$ of $\mathcal{M}_\infty$. The transition rates and cost/reward functions of $\mathcal{M}_\infty$ are allowed to be unbounded. Our approach combines linear programming with the Lagrange multiplier method. |
Key words: continuous-time Markov decision processes; optimal value; optimal policies; constrained average criteria; occupation measures |