Dataflow
Computer science
Field-programmable gate array
Scheduling (production processes)
Computer architecture
Embedded systems
Parallel computing
Operations management
Economics
Authors
Yi Wan, Xianzhong Xie, Junfan Chen, Kunpeng Xie, Dezhi Yi, Ye Lu, Keke Gai
Identifier
DOI:10.1016/j.future.2024.04.038
Abstract
Lightweight convolutional neural networks (CNNs) enable lower inference latency and data traffic, facilitating deployment on resource-constrained edge devices such as field-programmable gate arrays (FPGAs). However, CNN inference requires access to off-chip synchronous dynamic random-access memory (SDRAM), which significantly degrades inference speed and system power efficiency. In this paper, we propose an adaptive dataflow scheduling method for lightweight CNN accelerators on FPGAs, named ADS-CNN. The key idea of ADS-CNN is to efficiently utilize on-chip resources and reduce the amount of SDRAM access. To achieve reuse of logic resources, we design a time-division multiplexing calculation engine that is integrated into ADS-CNN. We implement a configurable module for the convolution controller that adapts to the data-reuse pattern of each convolution layer, thus reducing off-chip accesses. Furthermore, we exploit on-chip memory blocks as buffers based on the configuration of the different layers in lightweight CNNs. On the resource-constrained Intel Cyclone V SoC 5CSEBA6 FPGA platform, we evaluated six common lightweight CNN models to demonstrate the performance advantages of ADS-CNN. The evaluation results indicate that, compared with accelerators that use a traditional tiling-strategy dataflow, ADS-CNN achieves up to 1.29× speedup while compressing the overall dataflow scale by 23.7%.
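The abstract contrasts ADS-CNN with a traditional tiling-strategy dataflow, where off-chip SDRAM traffic is governed by the tile sizes chosen for each loop dimension. The sketch below is a simplified, generic traffic model for one convolution layer under loop tiling (output partial sums kept on chip across the input-channel loop); it is not the authors' formulation, and the tile-size parameters `Tm/Tn/Tr/Tc` are illustrative assumptions. It shows why per-layer adaptation of buffering matters: tile choice changes off-chip traffic by large factors.

```python
import math

def conv_offchip_traffic(M, N, R, C, K, S, Tm, Tn, Tr, Tc):
    """Estimate off-chip (SDRAM) words moved for one conv layer
    under a tiled dataflow.

    M, N     -- output / input channel counts
    R, C     -- output feature-map height / width
    K, S     -- kernel size / stride
    Tm..Tc   -- tile sizes along M, N, R, C (must fit on-chip buffers)
    """
    tiles_m = math.ceil(M / Tm)
    tiles_n = math.ceil(N / Tn)
    tiles_r = math.ceil(R / Tr)
    tiles_c = math.ceil(C / Tc)
    # An input tile covers the receptive field of a Tr x Tc output tile.
    in_tile = Tn * (Tr * S + K - S) * (Tc * S + K - S)
    inputs  = tiles_m * tiles_n * tiles_r * tiles_c * in_tile
    # Weights are reloaded once per spatial output tile.
    weights = tiles_r * tiles_c * tiles_m * tiles_n * (Tm * Tn * K * K)
    # Partial sums accumulate on chip, so each output is written once.
    outputs = M * R * C
    return inputs + weights + outputs

# Larger tiles (more on-chip buffering) cut reloads of inputs and weights:
full  = conv_offchip_traffic(64, 64, 32, 32, 3, 1, 64, 64, 32, 32)
small = conv_offchip_traffic(64, 64, 32, 32, 3, 1, 8, 8, 8, 8)
print(full, small)  # full buffering moves far fewer words than 8x8x8x8 tiles
```

Under this model, a scheduler that picks tile sizes per layer (rather than one fixed tiling for the whole network) can shrink total SDRAM traffic, which is the effect the abstract reports as a 23.7% dataflow-scale compression.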