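Hyperspectral imaging
Computer science
Artificial intelligence
Convolutional neural network
Pattern recognition (psychology)
Convolution (computer science)
Contextual image classification
Transformer
Feature extraction
Residual
Block (permutation group theory)
Artificial neural network
Image (mathematics)
Algorithm
Mathematics
Physics
Quantum mechanics
Voltage
Geometry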
Authors
Junjie Zhang,Zhe Meng,Feng Zhao,Hanqiang Liu,Zhenhui Chang
Source
Journal: IEEE Geoscience and Remote Sensing Letters
[Institute of Electrical and Electronics Engineers]
Date: 2022-01-01
Volume/Issue: 19: 1-5
Citations: 66
Identifier
DOI:10.1109/lgrs.2022.3208935
Abstract
Hyperspectral images (HSIs) provide rich spectral information that supports accurate classification in many applications, and incorporating spatial information into the classification process can improve accuracy further. Existing convolutional neural networks (CNNs) usually focus only on local features in hyperspectral cubes, whereas the more recent vision transformer (ViT) focuses on global features in HSIs. In this letter, we propose a deep aggregated framework for HSI classification, called the convolution transformer mixer (CTMixer), to combine the advantages of these two paradigms effectively. First, a group parallel residual block is applied to capture local spectral-spatial features in the HSI patches. Second, a double-branch structure, consisting of a CNN branch and a transformer branch, is developed to capture local-global hyperspectral features. Finally, to combine CNN and ViT cleanly, a novel local-global multi-head self-attention mechanism is proposed, which introduces convolution operations into multi-head self-attention to further improve classification accuracy. Extensive experiments demonstrate that CTMixer achieves competitive classification results on several common HSI datasets compared with other state-of-the-art networks. The source code for this work will be available at https://github.com/ZJier/CTMixer.
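The abstract does not say how convolution is introduced into multi-head self-attention, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' CTMixer implementation. It assumes the local term is a zero-padded depthwise 3x3 convolution of the value map, added to the ordinary (global) attention output. All names (`local_global_attention`, `depthwise_conv3x3`, the weight matrices, the 5x5x8 patch size) are hypothetical and chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def depthwise_conv3x3(x, kernels):
    """Zero-padded, stride-1 depthwise convolution.

    x: (H, W, C) feature map; kernels: (3, 3, C), one 3x3 filter per channel.
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += padded[i:i + h, j:j + w, :] * kernels[i, j, :]
    return out


def local_global_attention(x, wq, wk, wv, kernels, num_heads):
    """Assumed combination: global self-attention plus a local conv term.

    x: (H, W, C) patch features; wq, wk, wv: (C, C) projections.
    Returns an (H, W, C) array.
    """
    h, w, c = x.shape
    d = c // num_heads
    tokens = x.reshape(h * w, c)  # every pixel becomes one token

    def split(t):
        # (N, C) -> (heads, N, d)
        return t.reshape(h * w, num_heads, d).transpose(1, 0, 2)

    values = tokens @ wv
    q, k, v = split(tokens @ wq), split(tokens @ wk), split(values)

    # Global branch: every token attends to every other token.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))  # (heads, N, N)
    global_out = (attn @ v).transpose(1, 0, 2).reshape(h, w, c)

    # Local branch (assumption): depthwise conv over the value map.
    local_out = depthwise_conv3x3(values.reshape(h, w, c), kernels)
    return global_out + local_out


rng = np.random.default_rng(0)
H, W, C, HEADS = 5, 5, 8, 2
x = rng.standard_normal((H, W, C))
wq, wk, wv = (rng.standard_normal((C, C)) * 0.1 for _ in range(3))
kernels = rng.standard_normal((3, 3, C)) * 0.1
out = local_global_attention(x, wq, wk, wv, kernels, HEADS)
print(out.shape)
```

In this sketch the attention term mixes information across the whole patch, while the convolution term only sees each pixel's 3x3 neighbourhood; summing the two is one simple way to obtain features that are both local and global. The published code at the repository above should be treated as the authoritative reference for the actual design.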