Computer science
Hypergraph
Segmentation
Artificial intelligence
Semantics (computer science)
Pattern recognition (psychology)
Mathematics
Programming language
Discrete mathematics
Authors
Qibin He, Xian Sun, Wenhui Diao, Zhiyuan Yan, Fanglong Yao, Kun Fu
Identifier
DOI: 10.1109/TIP.2023.3245324
Abstract
Multi-modal remote sensing (RS) image segmentation aims to comprehensively utilize multiple RS modalities to assign pixel-level semantics to the studied scenes, which can provide a new perspective for global city understanding. Multi-modal segmentation inevitably encounters the challenge of modeling intra- and inter-modal relationships, i.e., object diversity and modal gaps. However, previous methods are usually designed for a single RS modality and are limited by noisy collection environments and poorly discriminative information. Neuropsychology and neuroanatomy confirm that the human brain performs the guiding perception and integrative cognition of multi-modal semantics through intuitive reasoning. Therefore, establishing an intuition-inspired semantic understanding framework for multi-modal RS segmentation is the main motivation of this work. Driven by the superiority of hypergraphs in modeling high-order relationships, we propose an intuition-inspired hypergraph network (I2HN) for multi-modal RS segmentation. Specifically, we present a hypergraph parser that imitates guiding perception to learn intra-modal object-wise relationships. It parses the input modality into irregular hypergraphs to mine semantic clues and generate robust mono-modal representations. In addition, we design a hypergraph matcher that dynamically updates the hypergraph structure from the explicit correspondence of visual concepts, similar to integrative cognition, to improve cross-modal compatibility when fusing multi-modal features. Extensive experiments on two multi-modal RS datasets show that the proposed I2HN outperforms state-of-the-art models, achieving F1/mIoU scores of 91.4%/82.9% on the ISPRS Vaihingen dataset and 92.1%/84.2% on the MSAW dataset. The complete algorithm and benchmark results will be available online.
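For readers unfamiliar with how hypergraphs capture the high-order relationships mentioned in the abstract, the sketch below shows a generic HGNN-style hypergraph convolution step: node features are pooled into hyperedges and then scattered back to every member node, so each node mixes information from all the groups it belongs to. This is a minimal illustration only, not the authors' I2HN implementation; the function name, toy incidence matrix, and tensor shapes are assumptions made for the example.

```python
import torch

def hypergraph_conv(x, incidence, weight):
    """One generic hypergraph convolution step (illustrative sketch).

    x:         (N, C_in)  node features (e.g., pixel or region embeddings)
    incidence: (N, E)     binary incidence matrix; incidence[v, e] = 1
                          if node v belongs to hyperedge e
    weight:    (C_in, C_out) learnable projection
    """
    # Node and hyperedge degrees (clamped to avoid division by zero).
    d_v = incidence.sum(dim=1).clamp(min=1e-6)   # (N,)
    d_e = incidence.sum(dim=0).clamp(min=1e-6)   # (E,)

    # Pool node features into each hyperedge, normalized by hyperedge degree.
    edge_feat = (incidence.t() @ x) / d_e.unsqueeze(1)       # (E, C_in)

    # Scatter hyperedge features back to member nodes, normalized by node
    # degree: each node aggregates all hyperedges it participates in,
    # which is the high-order (beyond pairwise) relation being modeled.
    node_feat = (incidence @ edge_feat) / d_v.unsqueeze(1)   # (N, C_in)

    return node_feat @ weight                                # (N, C_out)

# Toy usage: 5 nodes grouped into 2 hyperedges (node 2 belongs to both).
x = torch.randn(5, 8)
incidence = torch.tensor([[1., 0.],
                          [1., 0.],
                          [1., 1.],
                          [0., 1.],
                          [0., 1.]])
w = torch.randn(8, 16)
out = hypergraph_conv(x, incidence, w)
print(out.shape)  # torch.Size([5, 16])
```

In this toy setting the incidence matrix is fixed; the paper's hypergraph matcher, by contrast, updates such a structure dynamically from cross-modal correspondences before fusing features.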