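Baseline (sea)
Drone
Benchmark (surveying)
Computer science
Object (grammar)
Artificial intelligence
Geology
Geography
Cartography
Biology
Oceanography
Genetics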
Authors
Kechen Song, Xiaogang Xue, Hongwei Wen, Yingying Ji, Yunhui Yan, Qinggang Meng
Source
Journal: IEEE Transactions on Intelligent Vehicles
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Pages: 1-12
Identifier
DOI:10.1109/tiv.2024.3398429
Abstract
In recent years, multispectral object detection has achieved remarkable results due to its ability to fuse information from visible and thermal modalities. However, existing visible-thermal datasets are built from manually aligned image pairs, which cannot fully represent the challenges of real-world scenarios where image pairs are often misaligned. Existing methods for visible-thermal object detection rely on aligned data and are limited by the accuracy of registration. To address these issues, we propose DVTOD, a misaligned visible-thermal object detection dataset captured by drones. DVTOD includes 16 challenging attributes and 54 capture scenes. Furthermore, we introduce a cross-modal alignment detector (CMA-Det) for misaligned visible-thermal object detection. First, we design an alignment network to estimate the visible-to-thermal deformation field, which is used to correct the misalignment between corresponding visible and thermal features. Second, we propose a strategy called Object Search Rectification (OSR) to improve the robustness of feature alignment. To better remove the interference of complex backgrounds, a bi-directional feature correction fusion module (BFCFM) is designed to calibrate bimodal features by exploiting the correlation of channel and spatial information between the two modalities. CMA-Det outperforms existing methods on the DVTOD dataset and two other visible-thermal object detection datasets. The dataset and code will be published at https://github.com/VDT-2048/DVTOD.
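
The abstract mentions using a visible-to-thermal deformation field to correct misaligned features. As a rough illustration of what applying such a field can look like, the sketch below bilinearly resamples a feature map at positions shifted by a per-pixel offset field. It is not the authors' CMA-Det code: the function name `warp_features`, the (C, H, W) layout, the pixel-unit offsets, and the border clamping are all assumptions, and the field here is set by hand rather than predicted by an alignment network.

```python
import numpy as np


def warp_features(feat, flow):
    """Bilinearly resample feat (C, H, W) at positions shifted by flow (2, H, W).

    Hypothetical helper for illustration only.
    flow[0] is the x (column) offset and flow[1] the y (row) offset, in pixels.
    Sample positions are clamped to the feature-map border.
    """
    c, h, w = feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Absolute sampling coordinates, clamped so they stay inside the map.
    x = np.clip(xs + flow[0], 0, w - 1)
    y = np.clip(ys + flow[1], 0, h - 1)

    # Integer corners surrounding each sampling position.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional distances used as interpolation weights.
    wx = x - x0
    wy = y - y0

    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bottom = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy


# Toy example: a hand-set field that samples one pixel to the right everywhere.
visible = np.arange(12, dtype=float).reshape(1, 3, 4)
flow = np.zeros((2, 3, 4))
flow[0] = 1.0
aligned = warp_features(visible, flow)
```

In this toy case every output pixel takes the value of its right-hand neighbour, and the last column repeats the border value because of the clamping. A learned field would vary per pixel, which is what lets it model non-rigid misalignment between the two modalities.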