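Stochastic game
Probabilistic logic
Mathematical optimization
Markov decision process
Computer science
Range (aeronautics)
Markov process
Mathematical economics
Mathematics
Artificial intelligence
Statistics
Composite material
Materials science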
Authors
Krishnendu Chatterjee, Adrián Elgyütt, Petr Novotný, Owen Rouillé
Identifier
DOI:10.24963/ijcai.2018/652
Abstract
Partially-observable Markov decision processes (POMDPs) with discounted-sum payoff are a standard framework for modeling a wide range of problems related to decision making under uncertainty. Traditionally, the goal has been to obtain policies that optimize the expectation of the discounted-sum payoff. A key drawback of the expectation measure is that even low-probability events with extreme payoff can significantly affect the expectation, and thus the obtained policies are not necessarily risk-averse. An alternative approach is to optimize the probability that the payoff is above a certain threshold, which yields risk-averse policies but ignores optimization of the expectation. We consider the expectation optimization with probabilistic guarantee (EOPG) problem, where the goal is to optimize the expectation while ensuring that the payoff is above a given threshold with at least a specified probability. We present several results on the EOPG problem, including the first algorithm to solve it.
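The EOPG objective described in the abstract can be illustrated with a small sketch. This is not the algorithm from the paper: it only shows the shape of the objective, assuming a finite set of candidate policies whose discounted-sum payoffs have already been sampled. The names `discounted_sum`, `eopg_select`, and the sample data are hypothetical and chosen for illustration.

```python
from statistics import mean


def discounted_sum(rewards, gamma):
    """Discounted-sum payoff of one run: sum of gamma**i * r_i."""
    return sum((gamma ** i) * r for i, r in enumerate(rewards))


def eopg_select(payoff_samples, threshold, min_prob):
    """Illustrative selection rule (an assumption, not the paper's method).

    Among candidate policies whose empirical probability of reaching
    `threshold` is at least `min_prob`, return the name of the one with
    the highest empirical mean payoff. Returns None if no candidate
    satisfies the probabilistic guarantee.
    """
    best_name, best_mean = None, None
    for name, payoffs in payoff_samples.items():
        # Empirical estimate of P(payoff >= threshold).
        prob = sum(p >= threshold for p in payoffs) / len(payoffs)
        if prob < min_prob:
            continue  # violates the guarantee, so it is not eligible
        avg = mean(payoffs)
        if best_mean is None or avg > best_mean:
            best_name, best_mean = name, avg
    return best_name


# Hypothetical sampled payoffs for three made-up policies.
samples = {
    "risky": [100.0, 0.0, 0.0, 0.0],  # high mean, rarely above threshold
    "safe": [6.0, 6.0, 6.0, 6.0],     # low mean, always above threshold
    "mixed": [20.0, 8.0, 5.0, 1.0],   # in between
}

print(eopg_select(samples, threshold=5.0, min_prob=0.75))
```

With a threshold of 5 and a required probability of 0.75, the purely expectation-maximizing choice ("risky") is excluded, and the selection falls on the best-in-expectation policy among those that meet the guarantee. Raising the required probability narrows the eligible set further.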