Cross-Modality Image Matching Network With Modality-Invariant Feature Representation for Airborne-Ground Thermal Infrared and Visible Datasets

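Artificial intelligence, Computer science, Remote sensing, Pattern recognition (psychology), Modality (human-computer interaction), Computer vision, Invariant (physics), Discriminative, Representation (politics), Feature (linguistics), Feature learning, Mathematics, Linguistics, Politics, Geology, Philosophy, Mathematical physics, Law, Political science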

Authors

Song Cui,Ailong Ma,Yuting Wan,Yanfei Zhong,Bin Luo,Miaozhong Xu

Source

Journal: IEEE Transactions on Geoscience and Remote Sensing [Institute of Electrical and Electronics Engineers]
Date: 2021-08-04 Volume/Issue: 60: 1-14 Citations: 31

Identifier

DOI: 10.1109/tgrs.2021.3099506

Abstract

Thermal infrared (TIR) remote-sensing imagery can allow objects to be imaged clearly at night through the long-wave infrared, so that the fusion of thermal infrared and visible (VIS) imagery is a way to improve the remote-sensing interpretation ability. However, due to the large radiation difference between the two kinds of images, it is very difficult to match them. One of the most important issues is the lack of comprehensive consideration of the modality-specific information and modality-shared information, which makes it difficult for the existing methods to obtain a modality-invariant feature representation. In this article, a cross-modality image matching network, which we refer to as CMM-Net, is proposed to realize thermal infrared and visible image matching by learning a modality-invariant feature representation. First, in order to extract the modality-specific features of the imagery, the framework constructs a shallow two-branch network to make full use of the modality-specific information, without sharing parameters. Second, in order to extract the high-level semantic information between the different modalities, modality-shared layers are embedded into the deep layers of the network. In addition, three novel loss functions are designed and combined to learn the modality-invariant feature representation, that is, the discriminative loss of the non-corresponding features in the same modality, the cross-modality loss of the corresponding features between different modalities, and the cross-modality triplet (CMT) loss. The multimodal matching experiments conducted with ground- and airborne-based thermal infrared images and visible images showed that the proposed method outperforms the existing image matching methods by about 2% and 6% for the ground and airborne images, respectively.
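
The abstract names a cross-modality triplet (CMT) loss but does not give its formula. As a rough illustration of the general idea only, the sketch below applies a generic triplet margin loss across modalities: each TIR descriptor is pulled toward its corresponding VIS descriptor and pushed away from the closest non-corresponding one in the batch, and vice versa. The function names, the margin of 0.3, the L2 normalization, and the batch-hard negative selection are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def cross_modality_triplet_loss(tir, vis, margin=0.3):
    """Generic triplet margin loss between two modalities (illustrative only).

    tir, vis: arrays of shape (N, D) with N >= 2, where tir[i] and vis[i]
    are descriptors of the same patch in the two modalities.
    margin: hypothetical margin value, not taken from the paper.
    """
    tir = l2_normalize(np.asarray(tir, dtype=float))
    vis = l2_normalize(np.asarray(vis, dtype=float))

    # dist[i, j] = Euclidean distance between tir[i] and vis[j].
    diff = tir[:, None, :] - vis[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2))

    # Corresponding (positive) pairs sit on the diagonal.
    pos = np.diag(dist)

    # Hide the diagonal so the minimum only ranges over non-corresponding pairs.
    masked = dist + np.eye(dist.shape[0]) * 1e6
    hardest_vis_neg = masked.min(axis=1)  # closest wrong VIS for each TIR anchor
    hardest_tir_neg = masked.min(axis=0)  # closest wrong TIR for each VIS anchor

    loss = (np.maximum(0.0, margin + pos - hardest_vis_neg)
            + np.maximum(0.0, margin + pos - hardest_tir_neg))
    return float(loss.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tir_desc = rng.normal(size=(8, 16))
    vis_desc = tir_desc + 0.05 * rng.normal(size=(8, 16))  # nearly aligned pairs
    print(cross_modality_triplet_loss(tir_desc, vis_desc))
```

Under these assumptions the loss reaches zero once every corresponding pair is closer than any non-corresponding pair by at least the margin, which is one common way to encourage descriptors that do not depend on the imaging modality.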
