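Reinforcement learning
Computer science
Personalization
Federated learning
Constraint (computer-aided design)
Regularization (linguistics)
Distributed computing
Artificial intelligence
World Wide Web
Mechanical engineering
Engineering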
Authors
Weicheng Xiong,Quan Liu,Fanzhang Li,Bangjun Wang,Fei Zhu
Identifier
DOI:10.1016/j.eswa.2023.122290
Abstract
Traditional federated reinforcement learning methods aim to find a single optimal global policy for all agents. However, because the agents' environments are heterogeneous, this global policy is often suboptimal for an individual agent. To resolve this problem, we propose a personalized federated reinforcement learning method, named perFedDC, which aims to establish an optimal personalized policy for each agent. Our method maintains a global model and multiple local models, and uses the l2-norm to measure the distance between the global model and each local model. We introduce this distance constraint as a regularization term in the local model update to prevent excessive policy updates. While the distance constraint facilitates experience sharing, it is important to strike an appropriate balance between personalization and sharing, so that agents benefit from shared experience as much as possible while still developing personalized policies. The experiments demonstrated that perFedDC accelerated agent training in a stable manner while maintaining the privacy constraints of federated learning. Furthermore, agents newly added to the federated system were able to quickly develop effective policies with the aid of the converged global policy.
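The distance-constrained local update described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything below is an assumption for illustration: the penalty is taken to be a squared l2 term, (lam / 2) * ||theta_local - theta_global||^2, added to the agent's own loss, and the names `penalized_local_step`, `l2_distance`, `lr`, and `lam` are hypothetical rather than taken from perFedDC.

```python
import numpy as np


def l2_distance(theta_local, theta_global):
    """l2-norm distance between local and global parameter vectors."""
    diff = np.asarray(theta_local, dtype=float) - np.asarray(theta_global, dtype=float)
    return float(np.linalg.norm(diff))


def penalized_local_step(theta_local, theta_global, grad, lr=0.1, lam=0.5):
    """One gradient step on: local_loss(theta) + (lam / 2) * ||theta - theta_global||^2.

    `grad` is the gradient of the agent's own loss at theta_local.
    The squared-l2 penalty and the step rule are assumptions, not the paper's exact update.
    """
    theta_local = np.asarray(theta_local, dtype=float)
    theta_global = np.asarray(theta_global, dtype=float)
    grad = np.asarray(grad, dtype=float)
    # Gradient of the assumed penalty term: lam * (theta_local - theta_global).
    penalty_grad = lam * (theta_local - theta_global)
    return theta_local - lr * (grad + penalty_grad)


if __name__ == "__main__":
    local = np.array([1.0, 1.0])
    shared = np.array([0.0, 0.0])
    # With a zero task gradient, only the penalty acts and pulls the local model toward the global one.
    updated = penalized_local_step(local, shared, grad=np.zeros(2))
    print(l2_distance(local, shared), l2_distance(updated, shared))
```

In this sketch, `lam` plays the role of the personalization/sharing trade-off mentioned in the abstract: `lam = 0` leaves each agent fully personalized, while larger values keep the local model closer to the global one.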