Computer science
Multispectral image
Pedestrian detection
Artificial intelligence
Modality
Feature extraction
Pattern recognition
RGB color model
Homogeneity (statistics)
Sensor fusion
Object detection
Computer vision
Machine learning
Pedestrian
Engineering
Chemistry
Transportation engineering
Polymer chemistry
Authors
Ruimin Li, Jiajun Xiang, Feixiang Sun, Ye Yuan, Longwu Yuan, Shuiping Gou
Identifier
DOI: 10.1109/TMM.2023.3272471
Abstract
Multispectral pedestrian detection, which leverages visible-thermal modality pairs, has shown many advantages in a variety of environments, particularly under poor illumination. However, in-depth insight into how to distinguish the complementary content of multimodal data, and to what extent multimodal features should be fused, is still lacking. In this paper, we propose a novel multispectral pedestrian detector with multiscale cross-modal homogeneity enhancement and confidence-aware feature fusion. RGB and thermal streams are constructed to extract features and generate candidate proposals. During feature extraction, multiscale cross-modal homogeneity enhancement strengthens single-modal features with the separated homogeneous features obtained through modal interaction. Homogeneity features, which encode the semantic information of the scene, are extracted from RGB-thermal pairs by a channel attention mechanism. Proposals from the two modalities are united to obtain multimodal proposals. Then, confidence measurement fusion performs multispectral feature fusion within each proposal by measuring the internal confidence of each modality and the interaction confidence between modalities. In addition, a confidence transfer loss function is designed to focus training on hard-to-detect samples. Experimental results on two challenging datasets demonstrate that the proposed method outperforms existing methods.
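The paper's exact formulation is not given in this abstract. As a rough illustration only, the two ideas it names can be sketched in plain Python under simplifying assumptions: a squeeze-and-excitation-style channel gate standing in for the channel attention that extracts shared (homogeneous) content from an RGB-thermal pair, and a softmax over per-modality confidence scores standing in for confidence measurement fusion. All names, shapes, and operations here are hypothetical, not the authors' method.

```python
import math


def channel_attention(feat_rgb, feat_thermal):
    """Hypothetical cross-modal channel gate (squeeze-and-excitation style).

    A feature map is a list of channels; each channel is a flat list of
    floats. A joint per-channel descriptor is computed from both modalities
    (squeeze), passed through a sigmoid (excitation), and used to reweight
    each modality's channels.
    """
    def squeeze(fm):
        # Global average pool: one scalar descriptor per channel.
        return [sum(ch) / len(ch) for ch in fm]

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Concatenate channel descriptors from both modalities, then gate.
    desc = squeeze(feat_rgb) + squeeze(feat_thermal)
    gates = [sigmoid(d) for d in desc]
    n = len(feat_rgb)
    g_rgb, g_th = gates[:n], gates[n:]

    def enhance(fm, gs):
        # Scale every value in a channel by that channel's gate.
        return [[v * g for v in ch] for ch, g in zip(fm, gs)]

    return enhance(feat_rgb, g_rgb), enhance(feat_thermal, g_th)


def confidence_fusion(feat_rgb, feat_thermal, conf_rgb, conf_thermal):
    """Fuse per-proposal features as a softmax-weighted sum of modalities."""
    m = max(conf_rgb, conf_thermal)  # subtract max for numerical stability
    e_r, e_t = math.exp(conf_rgb - m), math.exp(conf_thermal - m)
    w_r, w_t = e_r / (e_r + e_t), e_t / (e_r + e_t)
    return [[w_r * a + w_t * b for a, b in zip(cr, ct)]
            for cr, ct in zip(feat_rgb, feat_thermal)]
```

With equal confidence scores the fusion reduces to a simple average of the two modalities; a modality with a higher score dominates the fused feature, which is the intuition behind confidence-aware fusion.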