Yan Zhang, Lei Xu, Qian Hu, Chang Xu, Wen Yang, Gui-Song Xia
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing [Institute of Electrical and Electronics Engineers] | Date: 2024-01-01 | Volume/Issue: 62: 1-15 | Cited by: 1
Identifier
DOI:10.1109/tgrs.2024.3386735
Abstract
Thermal infrared (TIR) object detection plays a crucial role in diverse around-the-clock applications, such as search and rescue operations and wildlife protection. Achieving rapid and robust detection of small objects from an aerial perspective is particularly significant in these scenarios. However, the task is complicated by two interrelated challenges. First, small objects occupy only a few pixels and contain limited information. Second, TIR sensors are typically low-resolution (LR) due to inherent challenges associated with the imaging mechanism of the TIR spectrum. In contrast, high-resolution (HR) RGB sensors are readily available due to their cost-effectiveness and widespread application. Recognizing the importance of HR information, especially in the context of small object detection, we propose a cross-modality high-resolution knowledge distillation framework (CMHRD), which leverages knowledge from the HR-RGB modality and provides a novel strategy for TIR small object detection. The proposed framework introduces three key components: a super-resolution generative distillation loss for cross-modal high-resolution representation learning, a cross-modality affinity distillation loss to extract scene-level cross-modality information, and a response distillation loss aimed at mimicking the HR prediction. To facilitate research on small object detection with HR-RGB and LR-TIR data, we have curated and annotated two datasets, namely NOAA-Seal and VTUAV-det-small. Experimental results on NOAA-Seal demonstrate that CMHRD yields significant improvements, achieving a 6.39 mAP50 increase over a strong baseline without introducing additional computational cost during inference. Experiments on the single-category dataset VTUAV-det-small and the multi-category dataset RTDOD also show consistent improvements brought by CMHRD. The project is available at https://github.com/NNNNerd/CMHRD.
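
The abstract names an affinity distillation loss and a combined training objective but does not give their formulas. The following is a minimal sketch of one common way such a loss could be written, not the authors' implementation. It assumes the affinity is a pairwise cosine similarity between per-location feature vectors, that the student (LR-TIR) and teacher (HR-RGB) features have been sampled at the same number of locations, and that the terms are combined as a weighted sum. The function names and the weights `w_sr`, `w_aff`, and `w_resp` are hypothetical.

```python
import math


def cosine_affinity(features):
    """Pairwise cosine similarity between feature vectors (one per location)."""
    # Guard against zero vectors: a zero norm is replaced by 1.0.
    norms = [math.sqrt(sum(v * v for v in f)) or 1.0 for f in features]
    return [
        [sum(a * b for a, b in zip(fi, fj)) / (ni * nj)
         for fj, nj in zip(features, norms)]
        for fi, ni in zip(features, norms)
    ]


def affinity_distillation_loss(student_feats, teacher_feats):
    """Mean squared difference between student and teacher affinity matrices.

    Assumption: both inputs hold the same number of feature vectors.
    """
    a_s = cosine_affinity(student_feats)
    a_t = cosine_affinity(teacher_feats)
    n = len(a_s)
    total = sum((a_s[i][j] - a_t[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (n * n)


def total_distillation_loss(l_det, l_sr, l_aff, l_resp,
                            w_sr=1.0, w_aff=1.0, w_resp=1.0):
    """Hypothetical weighted sum of the detection loss and three distillation terms."""
    return l_det + w_sr * l_sr + w_aff * l_aff + w_resp * l_resp
```

Because the teacher appears only in these training-time loss terms, a student trained this way runs alone at inference, which is consistent with the abstract's claim of no additional inference cost.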