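Artificial intelligence
RGB color model
Computer science
Computer vision
Detector
Object detection
Pattern recognition (psychology)
Modality (human-computer interaction)
Convolutional neural network
Feature (linguistics)
Convolution (computer science)
Fusion
Perspective (graphical)
Feature selection
Object (grammar)
Artificial neural network
Telecommunications
Linguistics
Philosophy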
Authors
Tianyi Zhao, Maoxun Yuan, Xingxing Wei
Source
Journal: Cornell University - arXiv
Date: 2024-01-01
Identifier
DOI: 10.48550/arxiv.2401.10731
Abstract
Object detection in visible (RGB) and infrared (IR) images has been widely applied in recent years. By leveraging the complementary characteristics of RGB and IR images, an object detector can provide reliable and robust object localization from day to night. Existing fusion strategies feed RGB and IR images directly into convolutional neural networks, leading to inferior detection performance. Because the RGB and IR features carry modality-specific noise, these strategies degrade the fused features as the noise propagates through the network. Inspired by how the human brain processes multimodal information, this work introduces a new coarse-to-fine perspective to purify and fuse the features of the two modalities. Specifically, following this perspective, we design a Redundant Spectrum Removal module to coarsely remove interfering information within each modality and a Dynamic Feature Selection module to finely select the desired features for feature fusion. To verify the effectiveness of the coarse-to-fine fusion strategy, we construct a new object detector called Removal and Selection Detector (RSDet). Extensive experiments on three RGB-IR object detection datasets verify the superior performance of our method.
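The coarse-to-fine idea in the abstract can be illustrated with a toy NumPy sketch. This is not the RSDet implementation: the abstract does not specify how either module works, so the fixed low-frequency mask used for the "removal" step, the unlearned softmax gate used for the "selection" step, and the names `remove_redundant_spectrum`, `select_and_fuse`, and `keep_ratio` are all assumptions made purely for illustration.

```python
import numpy as np


def remove_redundant_spectrum(feat, keep_ratio=0.5):
    """Coarse step (assumed): keep a centered band of frequencies per channel.

    feat: real array of shape (C, H, W). A fixed mask stands in for
    whatever filtering the real module applies.
    """
    _, h, w = feat.shape
    # Move the zero-frequency component to the center of the spectrum.
    spec = np.fft.fftshift(np.fft.fft2(feat, axes=(-2, -1)), axes=(-2, -1))
    kh = min(h, max(1, int(h * keep_ratio)))
    kw = min(w, max(1, int(w * keep_ratio)))
    top, left = h // 2 - kh // 2, w // 2 - kw // 2
    mask = np.zeros((h, w), dtype=bool)
    mask[top:top + kh, left:left + kw] = True  # always contains the DC term
    spec = spec * mask
    out = np.fft.ifft2(np.fft.ifftshift(spec, axes=(-2, -1)), axes=(-2, -1))
    return out.real


def select_and_fuse(rgb, ir):
    """Fine step (assumed): per-pixel softmax gate over the two modalities.

    Each modality is scored by its mean absolute activation across channels;
    a real selection module would presumably learn these weights.
    """
    scores = np.stack([np.abs(rgb).mean(axis=0), np.abs(ir).mean(axis=0)])
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(scores)
    weights = exp / exp.sum(axis=0, keepdims=True)  # shape (2, H, W)
    fused = weights[0] * rgb + weights[1] * ir       # broadcasts over channels
    return fused, weights


# Hypothetical feature maps standing in for backbone outputs.
rng = np.random.default_rng(0)
rgb_feat = rng.normal(size=(4, 8, 8))
ir_feat = rng.normal(size=(4, 8, 8))

rgb_clean = remove_redundant_spectrum(rgb_feat, keep_ratio=0.5)
ir_clean = remove_redundant_spectrum(ir_feat, keep_ratio=0.5)
fused, weights = select_and_fuse(rgb_clean, ir_clean)
```

The ordering is the point of the sketch: each modality is filtered independently before any cross-modal weighting happens, so the gate only ever compares already-purified features.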