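Artificial intelligence
Computer science
Object detection
Feature (linguistics)
Computer vision
Pattern recognition (psychology)
Infrared
Fusion
Object (grammar)
Feature extraction
Image fusion
Image (mathematics)
Philosophy
Linguistics
Physics
Optics
Identifier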
DOI:10.1109/smc53992.2023.10394151
Abstract
Fusion techniques are widely used in multimodal object detection. Although many existing studies produce visually pleasing fused images, only a few focus on object detection performance. This study addresses that gap with an end-to-end object detection framework based on visible and infrared feature fusion (VIFF). Specifically, two separate processing units independently extract features from the visible and infrared images, and a novel fusion strategy then combines these features. The visible feature processing unit preserves the gradient direction of the visible image, while the infrared feature processing unit extracts the contrast and semantic features of the infrared image. The two sets of features are aggregated by attention mechanisms and then fed into the backbone of the object detection network. The fusion network achieves higher object detection accuracy than existing state-of-the-art approaches on various datasets. The proposed visible and infrared feature processing units are also shown to improve the performance of various object detection models.
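The abstract does not give implementation details, so the following is only a toy illustration of the general idea, not the VIFF architecture. It assumes hand-crafted stand-ins for the two processing units (a finite-difference gradient direction for the visible image and a 3x3 local contrast for the infrared image) and a per-pixel softmax weighting in place of the learned attention mechanisms. All function names and sample values are hypothetical.

```python
import math


def gradient_direction(img):
    """Per-pixel gradient direction in radians (central differences, clamped edges).

    Hypothetical stand-in for the visible feature processing unit.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = math.atan2(gy, gx)
    return out


def local_contrast(img):
    """Absolute deviation of each pixel from its 3x3 neighbourhood mean.

    Hypothetical stand-in for the infrared feature processing unit.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(y - 1, 0), min(y + 2, h))
                      for i in range(max(x - 1, 0), min(x + 2, w))]
            out[y][x] = abs(img[y][x] - sum(window) / len(window))
    return out


def attention_fuse(feat_a, feat_b):
    """Fuse two feature maps with per-pixel softmax weights over their magnitudes.

    Assumed simplification: real attention modules are learned, this one is fixed.
    """
    h, w = len(feat_a), len(feat_a[0])
    fused = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a, b = feat_a[y][x], feat_b[y][x]
            m = max(abs(a), abs(b))  # subtract the max for numerical stability
            ea, eb = math.exp(abs(a) - m), math.exp(abs(b) - m)
            wa = ea / (ea + eb)
            fused[y][x] = wa * a + (1.0 - wa) * b
    return fused


# Made-up 3x3 inputs: a horizontal ramp (visible) and a single hot spot (infrared).
visible = [[0.0, 0.5, 1.0],
           [0.0, 0.5, 1.0],
           [0.0, 0.5, 1.0]]
infrared = [[0.1, 0.1, 0.1],
            [0.1, 0.9, 0.1],
            [0.1, 0.1, 0.1]]

vis_feat = gradient_direction(visible)
ir_feat = local_contrast(infrared)
fused = attention_fuse(vis_feat, ir_feat)
```

In a detector, the fused map would be passed to the backbone in place of a single-modality input. Here the weighting simply favours whichever feature has the larger magnitude at each pixel, so the infrared hot spot dominates the fused value at the centre.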