Computer science
Inference
Scheduling (production processes)
Edge device
Edge computing
Central processing unit
Distributed computing
Artificial neural network
Metro train timetable
Artificial intelligence
Enhanced Data Rates for GSM Evolution (EDGE)
Cloud computing
Operating system
Mathematical optimization
Mathematics
Authors
Xiancheng Lin, Rongkai Liu, Jiajie Xie, Qian Wei, Zhi Zhou, Xu Chen, Zhilan Huang, Gang Lü
Identifiers
DOI: 10.1109/wcnc55385.2023.10118755
Abstract
Edge AI is an emerging paradigm that leverages edge computing to pave the last-mile delivery of artificial intelligence. To satisfy the stringent timeliness and energy-efficiency requirements of emerging edge AI tasks, edge nodes are increasingly equipped with specialized AI accelerators, Neural Processing Units (NPUs). Compared to traditional central processing units (CPUs), NPUs offer better performance and energy efficiency, but these benefits come at the cost of reduced inference accuracy. As a result, existing coarse-grained scheduling mechanisms that assign an entire DNN task to either the CPU or the NPU cannot make the best use of the NPU. To address this issue, we propose an online NPU-CPU co-inference scheduling mechanism that schedules DNN tasks at the fine-grained layer level, thereby fully exploiting the performance, accuracy, and power diversities of the NPU and CPU. By applying Lyapunov optimization to schedule network layers dynamically, the proposed online mechanism ensures real-time inference speed and caps the long-term time-averaged power consumption, while approximately minimizing the long-term inference accuracy loss. Through rigorous theoretical analysis and realistic trace-driven simulations, we demonstrate the effectiveness of the proposed online scheduling mechanism.
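The layer-level co-inference idea described in the abstract can be illustrated with a minimal Lyapunov drift-plus-penalty sketch. This is an assumed, simplified form, not the paper's actual formulation: the function names, the per-layer `(accuracy_loss, power)` cost model, the per-slot `power_budget`, and the trade-off weight `V` are all illustrative placeholders.

```python
# Hedged sketch of drift-plus-penalty scheduling for NPU-CPU co-inference.
# A virtual queue Q tracks accumulated "debt" against a long-term average
# power budget; each layer is placed on the device minimizing
# V * accuracy_loss + Q * power, so power overruns steer later layers
# toward the lower-power (here, hypothetically the NPU) option.

def schedule_layer(Q, options, V):
    """Pick the device minimizing V * accuracy_loss + Q * power.

    Q       -- current virtual-queue backlog (power-budget debt)
    options -- dict: device name -> (accuracy_loss, power)
    V       -- weight trading accuracy loss against power control
    """
    return min(options, key=lambda d: V * options[d][0] + Q * options[d][1])

def run_inference(layers, power_budget, V=10.0):
    """Schedule each layer online and update the virtual power queue."""
    Q, decisions = 0.0, []
    for options in layers:
        device = schedule_layer(Q, options, V)
        _, power = options[device]
        # Standard virtual-queue update: Q grows whenever this layer's
        # power draw exceeds the per-slot budget, and drains otherwise.
        Q = max(Q + power - power_budget, 0.0)
        decisions.append(device)
    return decisions
```

With a large backlog `Q`, the power term dominates and the scheduler prefers the energy-efficient device; with `Q` near zero, the accuracy term dominates, which is the mechanism by which the long-term average power is capped while accuracy loss is approximately minimized.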