Modality (human-computer interaction)
Artificial intelligence
Computer science
Pattern
Pattern recognition (psychology)
Computer vision
Social science
Sociology
Authors
Yukang Zhang, Yan Yan, Yang Lu, Hanzi Wang
Identifier
DOI: 10.1145/3474085.3475250
Abstract
Visible-infrared person re-identification (VI-ReID) aims to match the identities of pedestrians across different spectra. One of the major challenges in this task is the modality discrepancy between visible (VIS) and infrared (IR) images. Some state-of-the-art methods design complex networks or generative models to mitigate the modality discrepancy, but they ignore the highly non-linear relationship between the VIS and IR modalities. In this paper, we propose a non-linear middle modality generator (MMG), which helps to reduce the modality discrepancy. The MMG effectively projects VIS and IR images into a unified middle modality image (UMMI) space to generate middle-modality (M-modality) images. The generated M-modality images and the original images are fed into the backbone network to reduce the modality discrepancy. Furthermore, to pull together the two types of M-modality images generated from the VIS and IR images in the UMMI space, we propose a distribution consistency loss (DCL) that makes the modality distributions of the generated M-modality images as consistent as possible. Finally, we propose a middle modality network (MMN) to further enhance the discrimination and richness of the features in an explicit manner. Extensive experiments on two challenging datasets validate the superiority of MMN over state-of-the-art VI-ReID methods. On the SYSU-MM01 dataset, MMN outperforms even the latest state-of-the-art methods by more than 11.1% in Rank-1 accuracy and 8.4% in mAP.
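The abstract outlines the MMG and DCL design without implementation detail. The following is a minimal PyTorch sketch of what such a generator and loss could look like: the use of per-modality 1x1-convolution encoders, a shared decoder, and a DCL based on matching channel-wise statistics are all illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiddleModalityGenerator(nn.Module):
    """Hypothetical MMG sketch: two modality-specific non-linear encoders
    project VIS and IR images into a shared UMMI feature space, and a
    shared decoder maps them back to image space as M-modality images.
    Layer choices here are assumptions, not the published architecture."""
    def __init__(self, channels: int = 3, hidden: int = 16):
        super().__init__()
        # Per-modality non-linear encoders (1x1 convs keep spatial size).
        self.enc_vis = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1), nn.ReLU(inplace=True))
        self.enc_ir = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=1), nn.ReLU(inplace=True))
        # Shared decoder back to image space (assumed shared for both modalities).
        self.dec = nn.Conv2d(hidden, channels, kernel_size=1)

    def forward(self, x_vis: torch.Tensor, x_ir: torch.Tensor):
        m_vis = self.dec(self.enc_vis(x_vis))  # M-modality image from VIS
        m_ir = self.dec(self.enc_ir(x_ir))     # M-modality image from IR
        return m_vis, m_ir

def distribution_consistency_loss(m_vis: torch.Tensor, m_ir: torch.Tensor):
    """Hypothetical DCL: pull the channel-wise statistics (mean and std)
    of the two generated M-modality batches toward each other, so their
    distributions in the UMMI space become as consistent as possible."""
    mu_v, mu_i = m_vis.mean(dim=(0, 2, 3)), m_ir.mean(dim=(0, 2, 3))
    sd_v, sd_i = m_vis.std(dim=(0, 2, 3)), m_ir.std(dim=(0, 2, 3))
    return F.mse_loss(mu_v, mu_i) + F.mse_loss(sd_v, sd_i)

# Usage sketch: generated M-modality images would be fed, together with
# the original VIS/IR images, into a re-identification backbone.
gen = MiddleModalityGenerator()
vis = torch.randn(4, 3, 288, 144)  # dummy batch of VIS images
ir = torch.randn(4, 3, 288, 144)   # dummy batch of IR images
m_vis, m_ir = gen(vis, ir)
loss_dcl = distribution_consistency_loss(m_vis, m_ir)
```

Matching first- and second-order statistics is only one plausible way to enforce distribution consistency; the paper itself should be consulted for the exact formulation.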