Computer science
Artificial intelligence
Discriminative model
Pattern recognition (psychology)
Weighting
Modal verb
Synthetic aperture radar
Feature (linguistics)
Feature extraction
Fuse (electrical)
Image fusion
Fusion
Leverage (statistics)
Modality (human-computer interaction)
Sensor fusion
Computer vision
Data mining
Image (mathematics)
Engineering
Philosophy
Radiology
Electrical engineering
Medicine
Chemistry
Polymer chemistry
Linguistics
Authors
Feng Deng, Mei‐Yu Huang, Bajin Wei, Nan Ji, Xueshuang Xiang
Identifier
DOI:10.1007/978-981-99-8549-4_25
Abstract
Land use classification using optical and Synthetic Aperture Radar (SAR) images is a crucial task in remote sensing image interpretation. Recently, deep multi-modal fusion models have significantly enhanced land use classification by integrating multi-source data. However, existing approaches rely solely on simple fusion methods to leverage the complementary information from each modality, disregarding the intermodal correlation during the feature extraction process, which leads to inadequate integration of the complementary information. In this paper, we propose FASONet, a novel multi-modal fusion network consisting of two key modules that tackle this challenge from different perspectives. First, the feature alignment module (FAM) facilitates cross-modal learning by aligning high-level features from both modalities, thereby enhancing the feature representation for each modality. Second, we introduce the multi-modal squeeze and excitation fusion module (MSEM) to adaptively fuse discriminative features by weighting each modality and removing irrelevant parts. Our experimental results on the WHU-OPT-SAR dataset demonstrate the superiority of FASONet over other fusion-based methods, exhibiting a remarkable 5.1% improvement in MIoU compared to the state-of-the-art MCANet method.
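The abstract describes two ideas that can be made concrete: aligning high-level optical and SAR features, and a squeeze-and-excitation style weighting of the fused channels. Below is a minimal sketch of these two ideas, assuming a standard SE-style channel re-weighting over concatenated features and a simple cosine alignment loss; all module names, shapes, and formulations here are illustrative assumptions, not the authors' FASONet implementation.

```python
# Illustrative sketch only: SE-style fusion of optical/SAR feature maps plus a
# simple feature-alignment loss. Not the FASONet (FAM/MSEM) implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SqueezeExciteFusion(nn.Module):
    """Re-weights the concatenated optical/SAR channels, suppressing irrelevant ones."""

    def __init__(self, channels_per_modality: int, reduction: int = 16):
        super().__init__()
        fused = 2 * channels_per_modality
        self.excite = nn.Sequential(
            nn.Linear(fused, fused // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(fused // reduction, fused),
            nn.Sigmoid(),
        )

    def forward(self, opt_feat: torch.Tensor, sar_feat: torch.Tensor) -> torch.Tensor:
        x = torch.cat([opt_feat, sar_feat], dim=1)   # (B, 2C, H, W)
        w = self.excite(x.mean(dim=(2, 3)))          # squeeze to (B, 2C) channel weights
        return x * w[:, :, None, None]               # excite: re-weight each channel


def alignment_loss(opt_feat: torch.Tensor, sar_feat: torch.Tensor) -> torch.Tensor:
    """Encourages high-level optical and SAR features to agree (cosine distance)."""
    opt_vec = F.normalize(opt_feat.mean(dim=(2, 3)), dim=1)
    sar_vec = F.normalize(sar_feat.mean(dim=(2, 3)), dim=1)
    return (1.0 - (opt_vec * sar_vec).sum(dim=1)).mean()


if __name__ == "__main__":
    opt = torch.randn(2, 64, 32, 32)   # hypothetical optical backbone features
    sar = torch.randn(2, 64, 32, 32)   # hypothetical SAR backbone features
    fused = SqueezeExciteFusion(64)(opt, sar)
    print(fused.shape, alignment_loss(opt, sar).item())
```

In this reading, the alignment term plays the role the abstract assigns to FAM (pulling the two modalities' high-level representations together), while the channel weighting plays the role of MSEM (emphasizing discriminative channels and down-weighting irrelevant ones) before the fused features go to the classification head.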