Computer science
Inference
Testbed
Latency (audio)
Edge device
Enhanced Data Rates for GSM Evolution (EDGE)
Acceleration
Distributed computing
Optimization problem
Artificial intelligence
Partition (number theory)
Cloud computing
Algorithm
Computer network
Telecommunications
Mathematics
Combinatorics
Operating system
Physics
Classical mechanics
Authors
Fang Dong,Hui‐Tian Wang,Dian Shen,Zhaowu Huang,Qiang He,Jinghui Zhang,Liangsheng Wen,Tingting Zhang
Identifier
DOI:10.1109/tmc.2022.3172402
Abstract
Edge intelligence, a prospective paradigm for accelerating DNN inference, is mostly implemented via model partitioning, which inevitably incurs large transmission overhead from the DNN's intermediate data. A popular solution introduces multi-exit DNNs to reduce latency by enabling early exits. However, existing work ignores the correlation between exit settings and synergistic inference, causing incoordination between device and edge. To address this issue, this paper first investigates the bottlenecks of executing multi-exit DNNs in edge computing and builds a novel model for inference acceleration covering exit selection, model partition, and resource allocation. To tackle the intractable coupled subproblems, we propose a Multi-exit DNN inference Acceleration framework based on Multi-dimensional Optimization (MAMO). In MAMO, the exit selection subproblem is first extracted from the original problem. Then, bidirectional dynamic programming is employed to determine the optimal exit setting for an arbitrary multi-exit DNN. Finally, based on the optimal exit setting, a DRL-based policy is developed to learn joint decisions on model partition and resource allocation. We deploy MAMO on a real-world testbed and evaluate its performance in various scenarios. Extensive experiments show that it adapts to heterogeneous tasks and dynamic networks, and accelerates DNN inference by up to 13.7x compared with the state of the art.
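The early-exit idea at the core of multi-exit DNNs can be illustrated with a minimal sketch. This is not the paper's MAMO algorithm (which jointly optimizes exit selection, partitioning, and resource allocation); it only shows a simple confidence-threshold exit policy, with hypothetical `blocks` and `exit_heads` callables standing in for backbone stages and lightweight exit classifiers:

```python
def early_exit_inference(x, blocks, exit_heads, threshold=0.8):
    """Run a multi-exit DNN: after each backbone block, an exit head
    produces class probabilities; stop at the first exit whose top-1
    confidence reaches the threshold (a simplified illustrative policy)."""
    h = x
    probs = None
    for i, (block, head) in enumerate(zip(blocks, exit_heads)):
        h = block(h)          # run backbone stage i
        probs = head(h)       # lightweight exit classifier at stage i
        if max(probs) >= threshold:
            # Confident enough: exit early, skipping remaining stages.
            return i, probs.index(max(probs))
    # No intermediate exit fired: use the final exit's prediction.
    return len(blocks) - 1, probs.index(max(probs))


# Toy demo: confidence grows with depth, so inference exits at stage 1.
confidences = [[0.5, 0.5], [0.9, 0.1], [0.99, 0.01]]
blocks = [lambda h: h] * 3                    # identity stand-ins for stages
heads = [lambda h, p=p: p for p in confidences]
exit_idx, label = early_exit_inference([0.0, 0.0], blocks, heads)
print(exit_idx, label)  # → 1 0
```

In an edge-computing deployment such as the one the paper targets, the stages before a chosen partition point would run on the device and the rest on the edge server; an early exit then also avoids transmitting intermediate data for the skipped stages.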