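Computer science
Field-programmable gate array
Embedded system
Inference
Efficient energy use
Hardware acceleration
Software
Energy consumption
Reuse
Object detection
Computer engineering
Computer architecture
Computer hardware
Artificial intelligence
Operating system
Pattern recognition (psychology)
Engineering
Electrical engineering
Biology
Ecology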
Authors
Omar Eid, Mohammed A. Abd El Ghany
Identifier
DOI:10.1109/icm52667.2021.9664943
Abstract
The high computational power of GPUs has allowed larger networks to be used in object detection applications. However, their high power consumption and their inefficiency in memory access and in the number of bits used to represent data make them difficult to use in embedded applications. Therefore, extensive research has been conducted on using FPGAs as a highly efficient substitute for GPUs to implement deep learning algorithms. As the scale and complexity of these algorithms increase each year to improve their performance, it becomes even harder to implement them on an FPGA without reusing hardware resources. In this work, we implement Yolov4-tiny on a single FPGA by applying several resource sharing and optimization techniques. Our implementation reduces power consumption by 66% to 93.5% compared to software. Moreover, it uses fewer hardware resources and achieves a faster inference time. Compared with hardware implementations of networks of similar size, our design is 6.67 times faster and uses 62.5% less energy per image.