Computer science
Inference
Mobile computing
Distributed computing
Edge computing
Enhanced Data Rates for GSM Evolution (EDGE)
Mobile edge computing
Computer networks
Artificial intelligence
Authors
Hui Li,Xiuhua Li,Qilin Fan,Qiang He,Xiaofei Wang,Victor C. M. Leung
Identifier
DOI:10.1109/tmc.2024.3357874
Abstract
Model partitioning is a promising technique for improving the efficiency of distributed inference by executing partial deep neural network (DNN) models on edge servers (ESs) or Internet-of-Things (IoT) devices. However, due to the heterogeneous resources of ESs and IoT devices in mobile edge computing (MEC) networks, it is non-trivial to guarantee a DNN inference speed that satisfies specific delay constraints. Meanwhile, many existing DNN models have deep and complex architectures with numerous DNN blocks, which leads to a huge search space for fine-grained model partitioning. To address these challenges, we investigate distributed DNN inference with fine-grained model partitioning, with collaboration between ESs and IoT devices. We formulate the problem and propose a multi-task learning based asynchronous advantage actor-critic approach to find a competitive model partitioning policy that reduces DNN inference delay. Specifically, we combine the shared layers of the actor-network and critic-network via soft parameter sharing, and expand the output layer into multiple branches to determine the model partitioning policy for each DNN block individually. Experimental results demonstrate that the proposed approach outperforms state-of-the-art approaches, reducing total inference delay, edge inference delay, and local inference delay by an average of 4.76%, 10.04%, and 8.03%, respectively, in the considered MEC networks.
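The architecture described in the abstract — shared actor/critic layers coupled by soft parameter sharing, with the output layer expanded into one branch per DNN block — can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the authors' implementation: the state dimension, hidden width, number of blocks, and number of partition choices per block are all hypothetical placeholders, and training (the A3C updates) is omitted.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class BranchedActorCritic:
    """Sketch of a multi-branch actor-critic for per-block model partitioning.

    Hypothetical dimensions: `state_dim` encodes device/ES resource features;
    each of `num_blocks` DNN blocks gets its own output branch choosing among
    `num_choices` partition options (e.g. run on the IoT device vs. an ES).
    """
    def __init__(self, state_dim=8, hidden=16, num_blocks=4, num_choices=3, seed=0):
        rng = np.random.default_rng(seed)
        self.num_blocks = num_blocks
        # Soft parameter sharing: actor and critic keep SEPARATE copies of the
        # shared layer; an L2 penalty (below) pulls them together instead of
        # forcing identical weights (hard sharing).
        self.W_actor = rng.standard_normal((state_dim, hidden)) * 0.1
        self.W_critic = rng.standard_normal((state_dim, hidden)) * 0.1
        # One output branch per DNN block -> an individual partition policy
        # for every block, rather than a single joint head.
        self.branches = [rng.standard_normal((hidden, num_choices)) * 0.1
                         for _ in range(num_blocks)]
        self.W_value = rng.standard_normal((hidden, 1)) * 0.1

    def forward(self, state):
        h_a = np.tanh(state @ self.W_actor)    # actor's shared representation
        h_c = np.tanh(state @ self.W_critic)   # critic's shared representation
        # Per-block categorical distribution over partition choices.
        policies = [softmax(h_a @ Wb) for Wb in self.branches]
        value = float(h_c @ self.W_value)      # critic's state-value estimate
        return policies, value

    def soft_sharing_penalty(self, lam=1e-3):
        # L2 distance between the actor and critic shared layers; in training
        # this term would be added to the loss to softly tie the two.
        return lam * float(np.sum((self.W_actor - self.W_critic) ** 2))
```

Sampling one partition decision per block from `policies` then yields a complete fine-grained partitioning of the DNN, with the critic's value and the sharing penalty feeding the (omitted) A3C loss.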