Keywords
Reinforcement learning, Markov decision process, Computer science, Curse of dimensionality, Mathematical optimization, Baseline, Distributed computing, Operations research, Markov process, Artificial intelligence, Risk analysis (engineering), Engineering, Mathematics, Medicine, Statistics, Oceanography, Geology
Authors
Dongkyu Lee, Junho Song
Identifier
DOI:10.1016/j.ress.2023.109512
Abstract
Lifeline systems such as transportation and water distribution networks may deteriorate with age, raising the risk of system failure or degradation. Thus, system-level sequential decision-making is essential to address the problem cost-effectively while minimizing the potential loss. Researchers have proposed assessing the risk of lifeline systems using Markov Decision Processes (MDPs) to identify a risk-informed operation and maintenance (O&M) policy. In complex systems with many components, however, solving the MDP can become intractable because the sizes of the state and action spaces grow exponentially with the number of components. This paper proposes a multi-agent deep reinforcement learning framework termed parallelized multi-agent Deep Q-Network (PM-DQN) to overcome the curse of dimensionality. The proposed method takes a divide-and-conquer strategy, in which multiple subsystems are identified by community detection, and each agent learns the O&M policy of the corresponding subsystem. The agents establish policies that minimize the decentralized cost of the cluster unit, including the factorized cost. These learning processes run simultaneously in several parallel units, and the trained policies are periodically synchronized with the best ones, thereby improving the master policy. Numerical examples demonstrate that the proposed method outperforms baseline policies, including conventional maintenance schemes and the subsystem-level optimal policy.
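The divide-and-conquer idea in the abstract can be illustrated with a minimal sketch. The code below is not the paper's PM-DQN: it substitutes tabular Q-learning for deep Q-networks, a single-component deterioration chain (states 0 to 3, where 3 means failed) for a real lifeline network, and two hypothetical clusters (`subsystem_A`, `subsystem_B`) for communities found by community detection. All costs and probabilities are illustrative assumptions; the sketch only shows the pattern of one agent learning a repair policy per subsystem on a factorized, per-cluster cost.

```python
import random

# Assumed toy deterioration MDP: states 0..3 (3 = failed);
# action 0 = do nothing, action 1 = repair (resets to state 0).
N_STATES, FAIL = 4, 3
DETERIORATE_P = 0.4          # assumed chance of degrading one state per step
REPAIR_COST, FAIL_COST = 1.0, 10.0   # assumed O&M costs

def step(s, a, rng):
    """One-component transition; returns (next_state, cost)."""
    if a == 1:               # repair: pay repair cost, component is as new
        return 0, REPAIR_COST
    if s == FAIL:            # a failed component keeps incurring losses
        return FAIL, FAIL_COST
    s2 = s + 1 if rng.random() < DETERIORATE_P else s
    return s2, FAIL_COST if s2 == FAIL else 0.0

def train_agent(episodes=3000, gamma=0.95, alpha=0.2, eps=0.1, seed=0):
    """Tabular Q-learning agent, standing in for one subsystem's DQN agent."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(20):  # finite O&M horizon per episode
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = min((0, 1), key=lambda x: Q[s][x])  # greedy: minimize cost
            s2, cost = step(s, a, rng)
            # cost-minimizing Q-update (min over next actions, not max)
            Q[s][a] += alpha * (cost + gamma * min(Q[s2]) - Q[s][a])
            s = s2
    return Q

# One agent per hypothetical subsystem cluster; in PM-DQN these agents would
# train in parallel units with periodic synchronization of the best policies.
clusters = {"subsystem_A": train_agent(seed=1), "subsystem_B": train_agent(seed=2)}
for name, Q in clusters.items():
    policy = [min((0, 1), key=lambda a: Q[s][a]) for s in range(N_STATES)]
    print(name, policy)
```

Each agent learns to repair a failed component (action 1 in state 3) because the recurring failure cost dominates the one-time repair cost; scaling this per-cluster decomposition to many components is exactly what keeps the joint state-action space from exploding.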