Reinforcement learning
Computer science
Computation offloading
Markov decision process
Energy consumption
Computation
Artificial intelligence
Edge device
Mathematical optimization
Edge computing
Enhanced Data rates for GSM Evolution (EDGE)
Markov process
Algorithm
Cloud computing
Biology
Mathematics
Statistics
Operating system
Ecology
Authors
Xu Liu, Zhengyi Chai, Yalun Li, Yan-Yang Cheng, Yue Zeng
Identifier
DOI:10.1016/j.ins.2023.119154
Abstract
Unmanned aerial vehicle-assisted multi-access edge computing (UAV-MEC) plays an important role in complex environments such as mountainous and disaster areas. The computation offloading problem (COP) is one of the key issues in UAV-MEC, which mainly aims to balance the conflicting goals of minimizing energy consumption and delay. Due to the time-varying and uncertain nature of the UAV-MEC system, deep reinforcement learning is an effective method for solving the COP. Different from existing works, in this paper the COP in the UAV-MEC system is modeled as a multi-objective Markov decision process, and a multi-objective deep reinforcement learning method is proposed to solve it. In the proposed algorithm, the scalar reward of reinforcement learning is expanded into a vector reward, and the weights are dynamically adjusted to meet different user preferences. The most important preferences are selected by non-dominated sorting, which better preserves previously learned strategies. In addition, the Q-network structure combines Double Deep Q Network (Double DQN) with Dueling Deep Q Network (Dueling DQN) to improve optimization efficiency. Simulation results show that the algorithm achieves a good balance between energy consumption and delay, and can obtain a better computation offloading scheme.
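The abstract names three building blocks: a vector reward scalarized by preference weights, non-dominated (Pareto) sorting over the energy/delay objectives, and a dueling aggregation in the Q network. The sketch below illustrates those three ideas in isolation with plain Python; it is not the authors' implementation, and all function names, weights, and the example (energy, delay) values are illustrative assumptions.

```python
# Hedged sketch of three ideas from the abstract (illustrative only):
# (1) scalarizing a vector reward (energy, delay) with preference weights,
# (2) the dueling aggregation Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a'),
# (3) Pareto non-dominated filtering of candidate offloading schemes.
from statistics import mean


def scalarize(reward_vec, weights):
    """Weighted sum of a vector reward; the weights encode user preference."""
    return sum(w * r for w, r in zip(weights, reward_vec))


def dueling_q(value, advantages):
    """Dueling aggregation: combine a state value V(s) with per-action
    advantages A(s, a), centered by their mean for identifiability."""
    m = mean(advantages)
    return [value + a - m for a in advantages]


def dominates(u, v):
    """u Pareto-dominates v (minimization): no worse in every objective,
    strictly better in at least one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def non_dominated(points):
    """Return the Pareto front of (energy, delay) cost tuples."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


if __name__ == "__main__":
    # Candidate offloading schemes as (energy, delay) costs -- made-up numbers.
    schemes = [(3.0, 5.0), (2.0, 6.0), (4.0, 4.0), (3.5, 5.5)]
    print(non_dominated(schemes))  # (3.5, 5.5) is dominated by (3.0, 5.0)
```

A multi-objective agent in this style would train against `scalarize(...)` for each sampled preference weight, while `non_dominated(...)` selects which preference-conditioned policies are worth keeping.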