Towards High-accuracy and Real-time Two-stage Small Object Detection on FPGA

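Computer science, Field-programmable gate array, Object detection, Artificial intelligence, Computer vision, Object (grammar), Real-time computing, Pattern recognition (psychology), Embedded system
Authors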
S Y Li,Zhenhua Zhu,Hanbo Sun,Xuefei Ning,Guohao Dai,Yiming Hu,Huazhong Yang,Yu Wang
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology [Institute of Electrical and Electronics Engineers]
Volume (Issue): 34 (9): 8053-8066
Identifier
DOI:10.1109/tcsvt.2024.3385121
Abstract

Object detection via deep neural networks has advanced considerably in recent years. Yet detecting small objects, specifically those covering only a few pixels (i.e., < 32 × 32 pixels), remains challenging compared with detecting large objects (i.e., > 96 × 96 pixels). Existing methods commonly apply high-resolution features or complex super-resolution strategies based on the two-stage Faster Region Convolutional Neural Network (RCNN). They apply the localization and classification stages sequentially to a shared feature map extracted by a single backbone network. However, these methods suffer from low detection accuracy on small objects, high computational overhead, and wasted hardware resources. In this paper, we develop a high-accuracy, real-time small object detection system with negligible computational overhead and low hardware idleness. At the software level, we propose a two-stage Coarse-to-Fine Decoupling RCNN (CFD RCNN) with three techniques: (1) decoupling the shared backbone for localization and classification to achieve high accuracy on both tasks; (2) a training method that uses backbone feature upsampling for localization with low computational overhead; (3) an object cropping strategy that crops from the original high-resolution image for high-accuracy classification. At the hardware level, we propose a virtualized FPGA accelerator with a Dynamic Resource Allocation (DRA) strategy. The DRA strategy reallocates hardware resources according to the workload and resource preference of each stage in CFD RCNN to reduce hardware idleness. Extensive experiments on the TT100K and GTSDB datasets using a Xilinx ZCU102 FPGA show that the proposed small object detection system achieves a 2.9% improvement in mean average precision (mAP) compared with state-of-the-art (SOTA) algorithms and raises the throughput from 18.9 FPS to > 26.0 FPS (~1.37×) compared with existing accelerators.
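
The size thresholds and the cropping idea in technique (3) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the box format (x1, y1, x2, y2), the "medium" label for the range between the two thresholds, the `scale` factor between the coarse and original images, and the optional `margin` are all assumptions made for illustration. The sketch only shows how a box found at a coarse resolution might be mapped back to the original high-resolution image and cropped there.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), x2/y2 exclusive

SMALL_AREA = 32 ** 2  # abstract: small objects cover < 32 x 32 pixels
LARGE_AREA = 96 ** 2  # abstract: large objects cover > 96 x 96 pixels


def size_category(box: Box) -> str:
    """Classify a box by pixel area using the thresholds from the abstract."""
    x1, y1, x2, y2 = box
    area = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    if area < SMALL_AREA:
        return "small"
    if area > LARGE_AREA:
        return "large"
    return "medium"  # assumption: label for everything in between


def to_original_coords(box: Box, scale: float, width: int, height: int,
                       margin: int = 0) -> Tuple[int, int, int, int]:
    """Map a box from a downsampled image to integer pixel bounds in the
    original image, optionally padded by `margin` and clamped to the image."""
    x1, y1, x2, y2 = box
    left = max(0, math.floor(x1 * scale) - margin)
    top = max(0, math.floor(y1 * scale) - margin)
    right = min(width, math.ceil(x2 * scale) + margin)
    bottom = min(height, math.ceil(y2 * scale) + margin)
    return left, top, right, bottom


def crop(image: Sequence[Sequence[int]],
         region: Tuple[int, int, int, int]) -> List[List[int]]:
    """Cut a region out of an image stored as a list of pixel rows."""
    left, top, right, bottom = region
    return [list(row[left:right]) for row in image[top:bottom]]


if __name__ == "__main__":
    # Hypothetical 8x8 "original" image; the coarse stage saw it at half size.
    image = [[y * 8 + x for x in range(8)] for y in range(8)]
    region = to_original_coords((1, 1, 2, 2), scale=2, width=8, height=8)
    print(region, crop(image, region))
    print(size_category((0, 0, 20, 20)))
```

Cropping from the original image rather than from the downsampled one keeps every available pixel of a small object for the classification stage, which is the motivation the abstract gives for technique (3).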
