Huiying Wang, Chunping Wang, Qiang Fu, Dongdong Zhang, Renke Kou, Ying Yu, Jian Song
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing [Institute of Electrical and Electronics Engineers]  Date: 2024-01-01  Volume/Issue: 62: 1-21
Identifier
DOI: 10.1109/TGRS.2024.3367934
Abstract
Arbitrary-oriented object detection is vital for improving UAV sensing and has promising applications. However, challenges persist in detecting objects under extreme conditions such as low illumination and strong occlusion. Cross-modal feature fusion enhances detection in complex environments, but current methods do not adequately learn the features of each modality for the current environment, resulting in degraded performance. To address this, we propose the CRSIOD network, which effectively learns diverse sensor-image features to capture distinct scenarios and target characteristics. First, we design an illumination perception module to guide the object detection network in performing various feature-processing tasks. Second, to leverage the respective advantages of the two modalities and mitigate their negative impacts, we introduce an uncertainty-aware module that quantifies the uncertainty in each modality and uses it as a weight to steer the network toward learning that favors optimal object detection. Moreover, in the object detection network, we design a two-stream backbone based on the attention mechanism to enhance the learning of difficult samples, employ the CMAFF module to fully extract the shared and complementary features between the two modalities, and design a three-branch feature enhancement network to strengthen the learning of the three modal features separately. Finally, to optimize detection results, we design light-perception non-maximum suppression and replace the horizontal detection head with a rotated one to preserve object orientation. We evaluate the proposed CRSIOD on the public DroneVehicle dataset of UAV aerial images. Compared with existing commonly used methods, CRSIOD achieves state-of-the-art detection performance.
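The abstract does not specify how the uncertainty-aware module turns per-modality uncertainties into fusion weights. As a minimal, hypothetical sketch of the general idea only (function name, the scalar-uncertainty assumption, and the softmax weighting are all illustrative, not the paper's actual design): a lower predicted uncertainty for a modality yields a higher weight for its features.

```python
import numpy as np

def uncertainty_weighted_fusion(feat_rgb, feat_ir, sigma_rgb, sigma_ir):
    """Illustrative fusion of two modality feature maps (hypothetical).

    Each modality contributes in proportion to a softmax over its
    negative predicted uncertainty, so the less uncertain modality
    dominates the fused representation.
    """
    # Softmax over negative uncertainties: lower sigma -> larger weight.
    logits = np.array([-sigma_rgb, -sigma_ir], dtype=float)
    w = np.exp(logits - logits.max())
    w /= w.sum()
    # Weighted sum of the two feature maps (same shape assumed).
    fused = w[0] * np.asarray(feat_rgb) + w[1] * np.asarray(feat_ir)
    return fused, w
```

For example, with a confident RGB branch (sigma 0.2) and an uncertain infrared branch (sigma 0.8), the RGB features receive the larger weight, which matches the abstract's stated goal of steering learning toward the more reliable modality.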