CoMiX: Cross-Modal Fusion with Deformable Convolutions for HSI-X Semantic Segmentation

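Modal verb, Fusion, Segmentation, Artificial intelligence, Computer science, Computer vision, Materials science, Linguistics, Philosophy, Composite material
Authors
Xuming Zhang, Xingfa Gu, Qingjiu Tian, Lorenzo Bruzzone
Source
Journal: Cornell University - arXiv
Identifier
DOI: 10.48550/arxiv.2411.09023
Abstract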

Improving hyperspectral image (HSI) semantic segmentation by exploiting complementary information from a supplementary data type (referred to as the X-modality) is promising but challenging because of differences in imaging sensors, image content, and resolution. Current techniques struggle to enhance modality-specific and modality-shared information, and to capture dynamic interaction and fusion between modalities. In response, this study proposes CoMiX, an asymmetric encoder-decoder architecture with deformable convolutions (DCNs) for HSI-X semantic segmentation. CoMiX is designed to extract, calibrate, and fuse information from HSI and X data. Its pipeline comprises an encoder with two parallel, interacting backbones and a lightweight all-multilayer perceptron (ALL-MLP) decoder. The encoder consists of four stages, each incorporating 2D DCN blocks for the X modality to accommodate geometric variations and 3D DCN blocks for HSIs to adaptively aggregate spatial-spectral features. Each stage also includes a Cross-Modality Feature enhancement and eXchange (CMFeX) module and a feature fusion module (FFM). CMFeX exploits spatial-spectral correlations across modalities to recalibrate and enhance modality-specific and modality-shared features while adaptively exchanging complementary information between them. The outputs of CMFeX are fed into the FFM for fusion and passed to the next stage for further feature learning. Finally, the outputs of every FFM are integrated by the ALL-MLP decoder to produce the final prediction. Extensive experiments demonstrate that CoMiX achieves superior performance and generalizes well to various multimodal recognition tasks. The CoMiX code will be released.
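
The data flow described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract gives no layer-level details, so every function name, channel width, and operation below is an assumption made for illustration. In particular, `stage_block` replaces the 2D and 3D DCN blocks with average pooling plus a random channel projection (and treats HSI bands as plain channels), `cmfex` uses a simple channel-gated exchange, `ffm` is an element-wise mean, and the sketch reads "passed to the next stage" as each backbone receiving its own CMFeX output.

```python
import numpy as np

rng = np.random.default_rng(0)


def stage_block(x, out_channels):
    """Stand-in for one backbone stage (NOT a deformable convolution).

    Halves the spatial size by 2x2 average pooling, then mixes channels
    with a random linear projection followed by ReLU.
    """
    c, h, w = x.shape
    pooled = x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
    weight = rng.standard_normal((out_channels, c)) / np.sqrt(c)
    return np.maximum(np.einsum("oc,chw->ohw", weight, pooled), 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cmfex(f_hsi, f_x):
    """Hypothetical enhancement-and-exchange step.

    Each stream keeps its own features and adds the other stream's
    features, scaled by a channel gate computed from that other stream.
    """
    gate_hsi = sigmoid(f_hsi.mean(axis=(1, 2)))[:, None, None]
    gate_x = sigmoid(f_x.mean(axis=(1, 2)))[:, None, None]
    return f_hsi + gate_x * f_x, f_x + gate_hsi * f_hsi


def ffm(f_hsi, f_x):
    """Hypothetical fusion: element-wise mean of the two streams."""
    return 0.5 * (f_hsi + f_x)


def upsample(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def all_mlp_decoder(fused, embed_dim, num_classes):
    """Project each stage's fused map to embed_dim, upsample all maps to
    the first stage's resolution, concatenate, and classify per pixel."""
    target_h = fused[0].shape[1]
    parts = []
    for f in fused:
        c = f.shape[0]
        w = rng.standard_normal((embed_dim, c)) / np.sqrt(c)
        proj = np.einsum("ec,chw->ehw", w, f)
        parts.append(upsample(proj, target_h // f.shape[1]))
    stacked = np.concatenate(parts, axis=0)
    n = stacked.shape[0]
    w_cls = rng.standard_normal((num_classes, n)) / np.sqrt(n)
    return np.einsum("kc,chw->khw", w_cls, stacked)


def comix_forward(hsi, x, widths=(8, 16, 32, 64), num_classes=5):
    """Four encoder stages with two interacting streams, then the decoder."""
    f_hsi, f_x, fused = hsi, x, []
    for width in widths:
        f_hsi = stage_block(f_hsi, width)  # HSI backbone stage
        f_x = stage_block(f_x, width)      # X-modality backbone stage
        f_hsi, f_x = cmfex(f_hsi, f_x)     # recalibrate and exchange
        fused.append(ffm(f_hsi, f_x))      # per-stage fused feature
    return all_mlp_decoder(fused, embed_dim=16, num_classes=num_classes)


hsi = rng.standard_normal((30, 32, 32))  # toy HSI: 30 bands, 32x32 pixels
x = rng.standard_normal((1, 32, 32))     # toy single-channel X-modality image
logits = comix_forward(hsi, x)
print(logits.shape)  # (5, 16, 16)
```

The sketch only makes the control flow concrete: two streams interact at every one of the four stages, each stage emits one fused map, and the decoder combines all four maps at a common resolution. Weights are random and untrained, so the logits carry no meaning.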
