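Field-programmable gate array
Convolutional neural network
Computer science
Hardware acceleration
Acceleration
Embedded system
Enhanced Data Rates for GSM Evolution
Edge device
Software
Computer hardware
Computer architecture
Artificial intelligence
Cloud computing
Physics
Classical mechanics
Operating system
Programming language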
Authors
Panagiotis Mousouliotis, Nikolaos Tampouratzis, Ioannis Papaefstathiou
Identifier
DOI:10.1109/icat57854.2023.10171329
Abstract
Most FPGA-based Convolutional Neural Network (CNN) hardware accelerators target the datacenter rather than edge processing units. To help fill this gap, this work presents SqueezeJet-3 and the corresponding design flow of a novel FPGA-based embedded system, consisting of software and hardware, for accelerating edge CNN inference. SqueezeJet-3 is optimized for accelerating small ImageNet-class CNNs, such as SqueezeNet v1.1 and ZynqNet, on low-end, low-cost SoC FPGA devices. SqueezeJet-3 is evaluated against the DietChai accelerator, which is part of Xilinx's ChaiDNN v2 framework, in terms of performance, resource utilization, power, and accuracy; the results demonstrate that, for the acceleration of SqueezeNet v1.1, SqueezeJet-3 is better than DietChai in all categories. Our evaluation results also show that, by using the presented design framework, a developer can implement FPGA accelerators for larger CNNs, such as VGG16, with performance similar to that of accelerators designed with the Angel-Eye and fpgaConvNet frameworks, which are optimized for VGG16-like CNNs.