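Computer science
Field-programmable gate array
Throughput
Quantization (signal processing)
Convolutional neural network
Frame rate
Block (permutation group theory)
Convolution (computer science)
Frame (networking)
Block size
Computer hardware
Efficient energy use
Hardware acceleration
Artificial neural network
Computer engineering
Artificial intelligence
Algorithm
Geometry
Key (lock)
Wireless
Engineering
Electrical engineering
Telecommunications
Computer security
Mathematics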
Authors
Justin Knapheide, Benno Stabernack, Maximilian Kuhnke
Identifier
DOI:10.1109/fpl50879.2020.00053
Abstract
Convolutional neural networks are widely applied to computer vision tasks. For most of these applications, high throughput and energy efficiency are top priorities. MobileNetV2 features very low memory requirements as well as a relatively small model size. On the ILSVRC 2012 classification challenge, it provides a decent prediction accuracy of 71.7 percent at low computational requirements. We present an FPGA-based MobileNetV2 accelerator with a high throughput of 1050 frames per second at a power consumption of 34 watts under full load. This equates to an energy cost of 32 millijoules per frame. We describe our approach of using stream interfaces and auto-generated control signals to enable fast design of flexible architectures. By using quantization techniques that restrict the number format to 16-bit fixed point, we were able to reduce the memory usage for weights as well as activations by a factor of two. Since the basic building block of MobileNetV2 can also be used to build higher-performance networks, the findings of this paper remain applicable when higher prediction accuracies are required.
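
The two figures in the abstract that rest on simple arithmetic can be checked with a short sketch. It is illustrative only and is not the authors' implementation: the function names are hypothetical, the split of the 16 bits into 8 integer and 8 fractional bits is an assumption (the abstract states only "16 bit fixed point"), and the factor-of-two memory saving is reproduced here by assuming a 32-bit baseline format, which the abstract does not name.

```python
def energy_per_frame_mj(power_watts: float, frames_per_second: float) -> float:
    """Energy per frame in millijoules: (J/s) / (frames/s) * 1000."""
    return power_watts / frames_per_second * 1000.0


def quantize_fixed(value: float, frac_bits: int = 8, total_bits: int = 16) -> int:
    """Map a real value to a signed fixed-point integer.

    frac_bits=8 is an assumed split, not taken from the paper.
    Values outside the representable range saturate.
    """
    scaled = round(value * (1 << frac_bits))  # Python's round: ties go to even
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def dequantize_fixed(q: int, frac_bits: int = 8) -> float:
    """Recover the real value represented by a fixed-point integer."""
    return q / (1 << frac_bits)


# Figures stated in the abstract: 34 W at 1050 frames per second.
print(f"{energy_per_frame_mj(34.0, 1050.0):.1f} mJ per frame")

# Storage ratio under the assumed 32-bit baseline.
print(f"memory reduction factor: {32 // 16}")

# Round trip of one example weight through the assumed Q8.8 format.
w = 0.7
q = quantize_fixed(w)
print(q, dequantize_fixed(q))
```

With 8 fractional bits, the round-trip error of any in-range value is at most half a quantization step, 2^-9. A different integer/fraction split trades representable range against resolution without changing the 16-bit storage cost.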