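Residual
Unmanned aerial vehicle (drone)
Computer science
Stage (stratigraphy)
Artificial intelligence
Computer vision
Transformer
Pattern recognition (psychology)
Algorithm
Engineering
Voltage
Electrical engineering
Genetics
Biology
Paleontology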
Authors
Suwan Wang,Suxiao Wang,Yinpeng Wan,Zhenghong Xiao
Identifier
DOI:10.1109/iotaai62601.2024.10692694
Abstract
An improved Real-Time Detection Transformer (RTDETR) detector is proposed to address challenges in unmanned aerial vehicle (UAV) image detection, such as small and dense targets, complex backgrounds, and limited computational resources. In the backbone network, a lightweight cross-stage parallel feature extraction module (RepCSPNet block) is designed: the CSPNet block is optimized with the RepNBottleneck residual structure to enhance the network’s ability to capture long-range contextual information while reducing computational complexity. In the encoder, the DyScalSeq scale sequence fusion structure replaces the conventional Cross-Scale Feature Fusion Module. The dynamic scale sequence feature fusion module and the global-local spatial attention feature aggregation block work together to prevent the loss of small-target feature information caused by up-sampling and down-sampling operations. This enriches the detail available for small-target detection and improves the network’s ability to fuse multi-scale features. Experimental results on the VisDrone2019 dataset show that the proposed RTDETR detector improves mAP by 1.5% while reducing the model’s GFLOPs by 28.8%. These results validate the effectiveness and efficiency of the proposed improvements and provide new insights and technical support for research on UAV image detection.
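The abstract's motivation, that down-sampling and up-sampling can wash out small-target features, can be illustrated with a toy sketch. The code below is not the paper's DyScalSeq module or any part of RTDETR. It assumes plain 2x2 average pooling and nearest-neighbour up-sampling on a hypothetical 8x8 feature map containing a single one-pixel target, and the function names `downsample2x` and `upsample2x` are illustrative only.

```python
def downsample2x(grid):
    """Average-pool a 2D grid with a 2x2 window and stride 2."""
    h, w = len(grid), len(grid[0])
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4.0
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def upsample2x(grid):
    """Nearest-neighbour up-sampling: repeat every cell as a 2x2 patch."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# Hypothetical 8x8 feature map with a single one-pixel "target" of strength 1.0.
feature = [[0.0] * 8 for _ in range(8)]
feature[3][5] = 1.0

coarse = downsample2x(downsample2x(feature))  # 8x8 -> 4x4 -> 2x2
restored = upsample2x(upsample2x(coarse))     # 2x2 -> 4x4 -> 8x8

peak_before = max(max(row) for row in feature)
peak_after = max(max(row) for row in restored)
print(peak_before, peak_after)
```

In this toy setting the target's peak response is spread over a 4x4 region after the round trip, so its position and contrast are largely lost. This is the kind of degradation that multi-scale fusion designs such as the one described in the abstract aim to limit.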