Computer science
Field-programmable gate array
Algorithm
Convolution (computer science)
Acceleration
Authors
Ren-hao Cai,Li-quan Song,Peng Li,Bing-tong Zhang
Abstract
Convolutional neural networks are being applied in a growing range of scenarios and can be carried over to the infrared domain. A convolutional-layer accelerator is designed on an FPGA to meet the miniaturization and low-power requirements of embedded devices. The authors shrink the model by about a factor of four through low-bit quantization, reduce invalid computations through padding processing, and improve computing efficiency through dataflow and parallel computing, which together effectively reduce the computation time of the convolutional layer. Finally, taking the SSD algorithm on the FPGA as an example, the authors reduce the computation time to about one tenth of the CPU computation time. At the same time, the drop in the overall (macro) detection metric mAP50 (mean average precision) caused by quantization is within 3%, and the degradation in detection rate and false alarm rate is within 1%.
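The roughly fourfold model reduction is what one would expect if 32-bit floating-point weights were stored as 8-bit integers. The abstract does not state the bit width or the quantization scheme, so the following is only a minimal sketch under that assumption, using symmetric per-tensor quantization; the function names and sample weights are hypothetical and are not taken from the paper.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization to int8 (assumed scheme, not the paper's)."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the signed 8-bit range.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate floating-point weights."""
    return [v * scale for v in q]


# Hypothetical 32-bit weights, for illustration only.
weights = array("f", [0.5, -1.0, 0.25, 0.75, -0.125, 0.0, 1.0, -0.6])
q, scale = quantize_int8(weights)

fp32_bytes = weights.itemsize * len(weights)  # 4 bytes per weight
int8_bytes = q.itemsize * len(q)              # 1 byte per weight
compression_ratio = fp32_bytes / int8_bytes

# Worst-case rounding error is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, dequantize(q, scale)))
```

Under this assumption the storage ratio is exactly four, and the per-weight rounding error stays within half a quantization step, which is the kind of small perturbation that could plausibly account for an accuracy drop of a few percent.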