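Computer science
Markov decision process
Computation offloading
Mobile edge computing
Reinforcement learning
Dynamic programming
Server
Distributed computing
Optimization problem
Edge computing
Bandwidth (computing)
Enhanced Data Rates for GSM Evolution
Computer network
Markov process
Algorithm
Artificial intelligence
Statistics
Mathematics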
Authors
Xuemei Yang, Hong Luo, Yan Sun, Mohsen Guizani
Source
Journal: IEEE Internet of Things Journal
[Institute of Electrical and Electronics Engineers]
Date: 2022-07-06
Volume (Issue): 9 (23): 24065-24078
Citations: 5
Identifier
DOI:10.1109/jiot.2022.3188928
Abstract
Applications consisting of a group of modular tasks can be offloaded to the multiaccess edge computing (MEC) for lower delay and energy consumption. In a dynamic MEC system, the fine-grained cooperative and dynamic offloading solution is necessary for the scenario of reusing tasks among devices. Considering the transmission cooperation, shared wireless bandwidth, and changing task queues on devices and edge servers, we formulate a joint offloading optimization problem to minimize the long-term average task execution cost. Although deep reinforcement learning (DRL) is a popular method for the dynamic problem, existing DRL algorithms are not suitable for our problem because of the hybrid discrete-continuous action spaces and constraints among action dimensions. Therefore, we propose a hybrid average reward proximal policy optimization (hybrid-ARPPO) algorithm to jointly optimize the offloading decisions, cooperative transmission ratios, and edge server assignments. First, we decompose our offloading problem into two subproblems. One is a tractable linear programming problem for continuous transmission ratios, and the other is a Markov decision process (MDP) only with discrete actions for offloading decisions and server assignments. Second, we take the expected average reward as the performance measure and deprecate the discount factor, which can reduce the work of tuning algorithms. Third, we design an action mask layer in the policy network of hybrid-ARPPO to filter invalid actions. Extensive experiments show the effectiveness of our hybrid-ARPPO in different system scales and task arrival patterns.
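The abstract mentions an action mask layer that filters invalid actions in the policy network. The sketch below illustrates the general idea of masking a discrete policy's logits before normalization. It is not the authors' implementation: the function name `masked_softmax`, the boolean mask format, and the sample values are all assumptions made for illustration.

```python
import math


def masked_softmax(logits, mask):
    """Turn logits into probabilities, giving zero mass to masked-out actions.

    logits: list of floats, one per discrete action (hypothetical policy output).
    mask:   list of bools, True where the action is currently valid.
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be valid")

    # Subtract the largest valid logit for numerical stability.
    peak = max(l for l, m in zip(logits, mask) if m)

    # Invalid actions contribute exactly zero to the distribution.
    exps = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative values only: three candidate actions, the second one invalid.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
```

Because the invalid action receives probability zero, sampling from `probs` can never select it, which is the effect the abstract attributes to the mask layer.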