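Mathematics
Markov decision process
Markov chain
Average cost
Limit (mathematics)
Markov model
Mathematical optimization
Markov process
Partially observable Markov decision process
Applied mathematics
Statistics
Mathematical economics
Mathematical analysis
Economics
Neoclassical economics
Abstract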
The semi-Markov decision model is considered under the long-run average cost criterion. A new criterion is introduced which, for any policy, takes the limit of the expected cost incurred during the first n transitions divided by the expected length of the first n transitions. Conditions guaranteeing that an optimal stationary (non-randomized) policy exists are then presented. It is also shown that, under certain conditions, the new criterion is equivalent to the usual one.
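As a reading aid, the two criteria can be written out as below. The notation is an assumption made for illustration and is not taken from the paper: \pi is a policy, i the initial state, Z_n and T_n the cost and the elapsed time over the first n transitions, and Z(t) the cost incurred up to time t. The abstract speaks of a limit, so the sketch assumes the limits exist.

```latex
% Notation assumed for illustration only (not from the source):
%   \pi   policy,  i  initial state
%   Z_n   cost incurred during the first n transitions
%   T_n   total length (time) of the first n transitions
%   Z(t)  cost incurred up to time t

% New criterion: ratio of expected cost to expected length over n transitions
\[
  \varphi_\pi(i) \;=\; \lim_{n \to \infty}
    \frac{\mathbb{E}_\pi\!\left[ Z_n \mid X_0 = i \right]}
         {\mathbb{E}_\pi\!\left[ T_n \mid X_0 = i \right]}
\]

% Usual criterion: long-run expected cost per unit time
\[
  \bar{\varphi}_\pi(i) \;=\; \lim_{t \to \infty}
    \frac{\mathbb{E}_\pi\!\left[ Z(t) \mid X_0 = i \right]}{t}
\]
```

In this notation, the equivalence result stated in the abstract concerns conditions under which the two quantities coincide.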