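Computer science
Object detection
Artificial intelligence
Feature extraction
Encoder
Convolutional neural network
Deep learning
Pattern recognition (psychology)
Feature (linguistics)
Computer vision
Linguistics
Operating system
Philosophy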
Authors
Mengyuan Li,Changqing Cao,Zhejun Feng,Xiangkai Xu,Zengyan Wu,Shubing Ye,Jiawei Yong
Source
Journal: IEEE Geoscience and Remote Sensing Letters
[Institute of Electrical and Electronics Engineers]
Date: 2023-01-01
Volume/Issue: 20: 1-5
Citations: 8
Identifier
DOI:10.1109/lgrs.2023.3236777
Abstract
Remote sensing object detection is an important and challenging research topic in computer vision, with wide use in military and civilian fields. Recently, detection models that combine a convolutional neural network (CNN) with a transformer have achieved good results, but poor detection performance on small objects remains an urgent problem. This letter proposes a framework based on deformable end-to-end object detection with transformers (DETR) for object detection in remote sensing images. First, multiscale split attention (MSSA) is designed to extract more detailed feature information by grouping. Next, we propose a multiscale deformable prescreening attention (MSDPA) mechanism in the decoding layer, which performs prescreening so that the encoder–decoder structure can obtain the attention map more efficiently. Finally, the A–D loss function is applied to the prediction layer, increasing the attention paid to small objects and optimizing the intersection over union (IoU) function. Extensive experiments on the DOTA v1.5 and HRRSD datasets show that the reconstructed detection model is better suited to remote sensing objects, especially small ones. The average detection accuracy on the DOTA dataset improves by 4.4% (up to 75.6%), and the accuracy on small objects in particular rises by 5%.
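The abstract refers to the intersection over union (IoU) function without defining it. The following is a minimal sketch of plain IoU for axis-aligned boxes, included only to illustrate why small objects are hard for IoU-based objectives. It is not the A–D loss from the letter, whose form is not given here. The function name `iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max).

    Illustrative only: this is the standard IoU definition, not the A-D loss
    described in the letter.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


# The same 2-pixel localization error on a small and a large object.
small = iou((0, 0, 10, 10), (2, 2, 12, 12))
large = iou((0, 0, 100, 100), (2, 2, 102, 102))
print(f"small object IoU: {small:.3f}")
print(f"large object IoU: {large:.3f}")
```

With an identical 2-pixel offset, the 10x10 box scores a much lower IoU than the 100x100 box, which is one common reason detectors add terms that weight small objects more heavily.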