Hyperspectral imaging
Computer science
Artificial intelligence
Computer vision
Pattern recognition (psychology)
Contextual image classification
Remote sensing
Image (mathematics)
Geology
Authors
Le Sun, Hang Zhang, Yuhui Zheng, Zebin Wu, Zhonglin Ye, Haixing Zhao
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/Issue: 62: 1-15
Cited by: 6
Identifier
DOI: 10.1109/tgrs.2024.3392264
Abstract
In recent years, convolutional neural networks (CNNs) have achieved remarkable success in hyperspectral image (HSI) classification tasks, primarily due to their outstanding spatial feature extraction capabilities. However, CNNs struggle to capture the diagnostic spectral information inherent in HSI. In contrast, vision transformers are well suited to handling spectral sequence information and excel at capturing long-range correlations between pixels and bands. Nevertheless, due to information loss during propagation, some existing transformer-based classification methods fail to achieve sufficient spectral-spatial information mixing. To mitigate these limitations, we propose a memory-augmented spectral-spatial transformer (MASSFormer) for HSI classification. Specifically, MASSFormer incorporates two effective modules: the memory tokenizer (MT) and the memory-augmented transformer encoder (MATE). The former transforms spectral-spatial features into memory tokens that store prior knowledge. The latter extends traditional multi-head self-attention (MHSA) by incorporating these memory tokens, enabling ample information blending while alleviating potential depth decay in the model, and consequently improving classification performance. Extensive experiments conducted on four benchmark datasets demonstrate that the proposed method outperforms state-of-the-art methods. The source code is available at https://github.com/hz63/MASSFormer for the sake of reproducibility.
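As context for the abstract, a minimal sketch of the general idea of memory-augmented multi-head self-attention: learnable memory tokens are concatenated to the keys and values so that every input token can also attend to stored "prior knowledge". This is NOT the authors' MATE implementation (see their repository for that); the module name, memory size, and all hyperparameters below are illustrative assumptions.

```python
# Hypothetical sketch of memory-augmented self-attention, not the
# MASSFormer code. Memory tokens act as extra key/value entries only.
import torch
import torch.nn as nn


class MemoryAugmentedMHSA(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4, num_memory: int = 8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        # Learnable memory tokens shared across the batch (size is assumed).
        self.memory = nn.Parameter(torch.randn(num_memory, dim) * 0.02)
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        # Append memory tokens; they contribute keys/values but no queries.
        mem = self.memory.unsqueeze(0).expand(b, -1, -1)
        q, k, v = self.qkv(torch.cat([x, mem], dim=1)).chunk(3, dim=-1)
        q = q[:, :n]  # queries come from the input tokens only

        def split(t: torch.Tensor) -> torch.Tensor:
            return t.view(b, -1, self.num_heads, self.head_dim).transpose(1, 2)

        q, k, v = split(q), split(k), split(v)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        out = attn.softmax(dim=-1) @ v          # (b, heads, n, head_dim)
        out = out.transpose(1, 2).reshape(b, n, d)
        return self.proj(out)
```

The design choice illustrated here is that memory tokens enlarge the attention context without changing the output sequence length, which is how such tokens can mix extra information into every layer.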