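Computer science
Benchmark (surveying)
Feature (linguistics)
Pedestrian detection
Pattern recognition (psychology)
Artificial intelligence
Image fusion
Object detection
Fusion
Modal verb
Algorithm
Feature vector
Backbone network
Computer vision
Image (mathematics)
Pedestrian
Computer network
Philosophy
Chemistry
Geodesy
Transport engineering
Polymer chemistry
Engineering
Geography
Linguistics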
Authors
Ying Sun,Zhiqiang Hou,Chen Yang,Sugang Ma,Jiulun Fan
Identifier
DOI:10.1007/978-3-031-47634-1_30
Abstract
We propose an object detection algorithm based on dual-modal feature alignment that fully fuses visible and infrared image features. First, we propose a two-stream detection model that accepts visible and infrared image pairs as simultaneous input. Second, we design a gated fusion network consisting of a dual-modal feature alignment module and a feature fusion module. The network performs mid-level fusion: it is inserted as an intermediate layer of the two-stream backbone network. Specifically, the dual-modal feature alignment module extracts detailed information from the aligned dual-modal features by computing a multi-scale dual-modal aligned feature vector. The feature fusion module recalibrates the dual-modal fused features and then multiplies them with the dual-modal aligned features, achieving cross-modal fusion that jointly enhances low-level and high-level features. We validate the proposed algorithm on the publicly available KAIST pedestrian dataset and a self-built GIR dataset. On the KAIST dataset, the algorithm achieves an accuracy of 77.1%, which is 17.3% and 5.6% higher than the baseline algorithm YOLOv5-s when detecting on visible and infrared images alone, respectively; on the self-built GIR dataset, the detection accuracy is 91%, which is 1.2% and 14.2% higher than the baseline on visible and infrared images alone, respectively. The detection speed also meets real-time requirements.
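The abstract does not give the equations of the gated fusion network, so the following is only a minimal sketch of the general idea of gating two modality feature maps, not the authors' module. It assumes a simple channel-wise gate computed from globally pooled visible and infrared features; the function name `gated_fusion` and the weight matrices `w_vis` and `w_ir` are hypothetical and do not come from the paper.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(visible, infrared, w_vis, w_ir):
    """Fuse two (C, H, W) feature maps with a channel-wise gate.

    Assumption (not from the paper): the gate is a sigmoid of a linear
    function of the globally average-pooled features of both modalities.
    w_vis and w_ir are hypothetical (C, C) weight matrices.
    """
    if visible.shape != infrared.shape:
        raise ValueError("visible and infrared features must share a shape")

    # Squeeze each modality to one descriptor per channel: shape (C,)
    d_vis = visible.mean(axis=(1, 2))
    d_ir = infrared.mean(axis=(1, 2))

    # One gate value in (0, 1) per channel, informed by both modalities
    gate = sigmoid(w_vis @ d_vis + w_ir @ d_ir)

    # Broadcast the gate over the spatial dimensions and blend:
    # gate -> 1 favours the visible stream, gate -> 0 the infrared stream
    g = gate[:, None, None]
    fused = g * visible + (1.0 - g) * infrared
    return fused, gate


# Usage with random stand-in features (4 channels, 8x8 spatial grid)
rng = np.random.default_rng(0)
vis_feat = rng.normal(size=(4, 8, 8))
ir_feat = rng.normal(size=(4, 8, 8))
weights_vis = rng.normal(size=(4, 4))
weights_ir = rng.normal(size=(4, 4))
fused_feat, channel_gate = gated_fusion(vis_feat, ir_feat, weights_vis, weights_ir)
```

In a trained network the weights would be learned, and a block of this kind would sit between intermediate layers of the two backbone streams, which is where the abstract places its fusion network. The convex blend used here is a stand-in for the recalibrate-then-multiply step the abstract describes.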