强化学习
计算机科学
人工智能
交叉口(航空)
机器学习
数学优化
集合(抽象数据类型)
功能(生物学)
贝尔曼方程
数学
工程类
进化生物学
生物
程序设计语言
航空航天工程
作者
Tianmeng Hu,Biao Luo,Chunhua Yang,Tingwen Huang
标识
DOI:10.1109/tpami.2023.3283537
摘要
Deep reinforcement learning (RL) has been applied extensively to solve complex decision-making problems. In many real-world scenarios, tasks often have several conflicting objectives and may require multiple agents to cooperate, which are the multi-objective multi-agent decision-making problems. However, only few works have been conducted on this intersection. Existing approaches are limited to separate fields and can only handle multi-agent decision-making with a single objective, or multi-objective decision-making with a single agent. In this paper, we propose MO-MIX to solve the multi-objective multi-agent reinforcement learning (MOMARL) problem. Our approach is based on the centralized training with decentralized execution (CTDE) framework. A weight vector representing preference over the objectives is fed into the decentralized agent network as a condition for local action-value function estimation, while a mixing network with parallel architecture is used to estimate the joint action-value function. In addition, an exploration guide approach is applied to improve the uniformity of the final non-dominated solutions. Experiments demonstrate that the proposed method can effectively solve the multi-objective multi-agent cooperative decision-making problem and generate an approximation of the Pareto set. Our approach not only significantly outperforms the baseline method in all four kinds of evaluation metrics, but also requires less computational cost.
科研通智能强力驱动
Strongly Powered by AbleSci AI