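Scale (ratio)
Convolution (computer science)
Object (grammar)
Computer science
Fusion
Artificial intelligence
Remote sensing
Computer vision
Geography
Cartography
Linguistics
Philosophy
Artificial neural network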
Authors
Junjie Chen, Wei Jiang, Gang Wu, Jichang Yang, Jiandong Shang, Hengliang Guo, Dujuan Zhang, Shengguang Zhu
Identifier
DOI:10.20944/preprints202503.2267.v1
Abstract
Complex backgrounds and multi-scale objects make accurate object detection in remote sensing images challenging. Methods based on convolutional neural networks (CNNs) and self-attention have advanced the field, but each has a fundamental limitation: CNNs have limited receptive fields, which leads to inadequate global feature representation, whereas self-attention mechanisms, although adept at capturing long-range dependencies, incur high computational complexity that hampers practical efficiency and may weaken the representation of local detail. To address these limitations, this article proposes a CNN-Mamba fusion-based detection model, MambaRetinaNet, which uses a synergistic perception module (SPM) to model global information efficiently and enhance local feature extraction. In addition, to improve the feature pyramid network (FPN), we introduce a differentiated feature processing strategy and, based on it, design an asymmetric feature pyramid, MambFPN, that balances detection accuracy and computational efficiency. Experiments on four mainstream remote sensing datasets show that MambaRetinaNet reaches a mean average precision (mAP) of 77.50, 70.21, 57.17, and 71.50 on DOTA-v1.0, DOTA-v1.5, DOTA-v2.0, and DIOR-R, respectively, an average increase of 11% over the baseline. Notably, on DOTA-v2.0, MambaRetinaNet improves mAP by approximately 2 percentage points over the current state-of-the-art (SOTA) one-stage model, supporting the efficacy and generalizability of MambaRetinaNet in complex remote sensing scenarios.
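For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean average precision (mAP) figure is commonly computed. It assumes the all-point interpolated definition of per-class average precision; the abstract does not state which evaluation protocol the authors used, and the function names and inputs here are hypothetical, chosen only for illustration.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """All-point interpolated AP for one class (assumed protocol, not the paper's)."""
    if num_ground_truth == 0 or not scores:
        return 0.0
    # Rank detections by descending confidence.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the maximum precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the area under the resulting step-wise precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    if not per_class_ap:
        return 0.0
    return sum(per_class_ap) / len(per_class_ap)


if __name__ == "__main__":
    # Toy class: three detections, two ground-truth objects (illustrative data only).
    ap = average_precision([0.9, 0.8, 0.7], [True, False, True], 2)
    print(round(ap, 4))
    print(mean_average_precision([ap, 1.0]))
```

The matching of detections to ground truth (for rotated boxes, typically by an intersection-over-union threshold) is assumed to have happened upstream and is represented here only by the `is_true_positive` flags.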