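Field-programmable gate array
Computer science
Convolutional neural network
Hardware acceleration
Quantization (signal processing)
Toolchain
Workflow
Object detection
Edge device
Artificial intelligence
Embedded system
Computer architecture
Computer vision
Pattern recognition (psychology)
Cloud computing
Software
Operating system
Database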
Authors
Richard W. Yarnell, Md Sanzid Bin Hossain, Ronald F. DeMara
Identifier
DOI:10.1109/isqed57927.2023.10129324
Abstract
Until recently, FPGA-based acceleration of convolutional neural networks (CNNs) has remained an open-ended research problem. Herein, we evaluate one new method for rapidly implementing CNNs using industry-standard frameworks within Xilinx UltraScale+ FPGA devices. Within this workflow, referred to as Framework for Accelerating YOLO-Based ML on Edge-devices (FAYME), a TensorFlow model of the You Only Look Once version 4 (YOLOv4) object detection algorithm is realized using Xilinx’s Vitis AI toolchain. We test various levels of model bit-quantization and evaluate performance while simultaneously analyzing the utilization of available memory and processing elements. We also implement a ResNet-50 model to provide additional comparisons. In this paper, we present our YOLO model, which achieves a mAP of 0.581, and our ResNet model, which achieves a Top-5 accuracy of 0.950. Furthermore, we demonstrate that these results are possible while utilizing less than 25% of the throughput offered by a single hardware accelerator in an UltraScale+ FPGA.
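The abstract mentions testing several levels of model bit-quantization. As a rough illustration of why bit width matters, the sketch below applies generic symmetric uniform quantization to a handful of made-up weights and reports the worst-case rounding error at 4, 8, and 16 bits. This is an assumption-laden toy, not the paper's method and not the Vitis AI quantizer: the function names, the sample weights, and the per-tensor max-abs scaling are all hypothetical choices made for the example.

```python
def quantize_symmetric(values, bits):
    """Round floats onto a signed integer grid of the given bit width,
    then map them back to floats (quantize followed by dequantize).

    Hypothetical helper for illustration; uses per-tensor max-abs scaling.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2 for signed quantization")
    qmax = 2 ** (bits - 1) - 1            # largest integer level, e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)               # all zeros: nothing to quantize
    scale = max_abs / qmax                # float value represented by one integer step
    dequantized = []
    for v in values:
        q = round(v / scale)              # nearest integer level
        q = max(-qmax, min(qmax, q))      # clamp to the representable range
        dequantized.append(q * scale)
    return dequantized


def max_abs_error(values, bits):
    """Worst-case absolute difference between original and quantized values."""
    deq = quantize_symmetric(values, bits)
    return max(abs(a - b) for a, b in zip(values, deq))


# Made-up sample weights, not taken from any model in the paper.
weights = [0.5, -1.0, 0.25, 0.75, -0.1]
errors = {bits: max_abs_error(weights, bits) for bits in (4, 8, 16)}

for bits, err in errors.items():
    print(f"{bits:2d}-bit: max abs error = {err:.6f}")
```

With round-to-nearest, the error per value is bounded by half a quantization step, so each extra bit roughly halves the worst-case error. Real toolchains typically add calibration data and may constrain scales (for example to powers of two), which this sketch does not attempt to model.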