Computer science
Modality (human–computer interaction)
Segmentation
Artificial intelligence
Feature (linguistics)
Robustness (evolution)
Feature extraction
Pattern recognition (psychology)
Semantic gap
Computer vision
Image (mathematics)
Image retrieval
Linguistics
Biochemistry
Gene
Philosophy
Chemistry
Authors
Shengyu Xiao, Peijin Wang, Wenhui Diao, Xuee Rong, Xuexue Li, Kun Fu, Xian Sun
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing
[Institute of Electrical and Electronics Engineers]
Date: 2023-01-01
Volume/Issue: 61: 1-18
Citations: 3
Identifier
DOI: 10.1109/tgrs.2023.3334471
Abstract
The rapid development of satellite platforms has yielded abundant and diverse multi-source Earth-observation data, greatly facilitating the growth of multimodal semantic segmentation (MSS) in remote sensing. However, MSS still faces several challenges: 1) each modality carries inherent defects arising from its distinct imaging mechanism; 2) the intrinsic characteristics of each modality are insufficiently explored; and 3) the huge semantic gap between heterogeneous data makes feature fusion difficult. Failing to exploit the rich and diverse information provided by each modality, and ignoring the heterogeneity between modalities, hinders feature enhancement and significantly degrades semantic segmentation accuracy, while neglecting the semantic gap makes feature fusion challenging. In this study, we introduce a novel multimodal semantic segmentation framework that effectively mitigates these problems. Our approach employs a pseudo-siamese structure for feature extraction. Specifically, we propose a simple yet effective geometric topology structure modeling (GTSM) module to extract geometric relationships and texture information from optical data. We also present a modality intrinsic noise suppression (MINS) module that fully exploits radiation information and alleviates the effects of the unique geometric distortions of SAR. Furthermore, we present an adaptive multimodal feature fusion (AMFF) module that fully fuses the features of the different modalities. Extensive experiments on the WHU-OPT-SAR and DFC23 datasets validate the robustness and effectiveness of the proposed Modality Characteristics-Guided Semantic Segmentation (MoCG) network compared with state-of-the-art multimodal and single-modal semantic segmentation methods.
Our approach achieves the best performance on both datasets, with mIoU/OA of 69.1%/87.5% on WHU-OPT-SAR and 86.7%/97.3% on DFC23.
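The abstract describes an adaptive fusion module (AMFF) that merges optical and SAR features. The paper's internal design is not given here, so the following is only a minimal sketch of one common form of adaptive fusion: a learned sigmoid gate produces a per-element convex combination of the two modality features. The function name `adaptive_fusion` and the weight shapes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_fusion(f_opt, f_sar, w, b):
    """Gated fusion sketch (hypothetical, not the paper's exact AMFF):
    a gate in (0, 1) weighs each modality element-wise."""
    gate = sigmoid(np.concatenate([f_opt, f_sar], axis=-1) @ w + b)
    return gate * f_opt + (1.0 - gate) * f_sar

rng = np.random.default_rng(0)
C = 8                                  # toy channel count
f_opt = rng.normal(size=(4, C))        # optical features (toy data)
f_sar = rng.normal(size=(4, C))        # SAR features (toy data)
w = rng.normal(size=(2 * C, C)) * 0.1  # gate weights (untrained)
b = np.zeros(C)

fused = adaptive_fusion(f_opt, f_sar, w, b)
print(fused.shape)  # (4, 8)
```

Because the gate lies in (0, 1), every fused value stays between the corresponding optical and SAR values, which keeps the fusion stable even before training.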
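The reported metrics are mean intersection-over-union (mIoU) and overall accuracy (OA). These follow the standard definitions; the helper below is a hypothetical illustration, not code from the paper.

```python
import numpy as np

def miou_oa(pred, gt, num_classes):
    """Standard mIoU and overall accuracy over flat label maps.
    Classes absent from both pred and gt are skipped."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:
            ious.append(inter / union)
    oa = np.mean(pred == gt)          # fraction of correctly labeled pixels
    return float(np.mean(ious)), float(oa)

pred = np.array([0, 0, 1, 1, 2, 2])   # toy prediction
gt   = np.array([0, 1, 1, 1, 2, 0])   # toy ground truth
miou, oa = miou_oa(pred, gt, 3)       # miou = 0.5, oa = 4/6
```

Per-class IoUs here are 1/3, 2/3, and 1/2, giving mIoU 0.5, while 4 of 6 pixels match for OA.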