Keywords
Computer science
Artificial intelligence
Hyperspectral imaging
Pattern recognition (psychology)
Convolutional neural network
Gaussian distribution
Feature extraction
Computation
Kernel (algebra)
Block (permutation group theory)
Algorithm
Mathematics
Geometry
Quantum mechanics
Combinatorics
Physics
Authors
Chao Ma, Minjie Wan, Jian Wu, Xiaofang Kong, Ajun Shao, Fan Wang, Qian Chen, Guohua Gu
Source
Journal: IEEE Transactions on Instrumentation and Measurement
[Institute of Electrical and Electronics Engineers]
Date: 2023-01-01
Volume 72, pp. 1-12
Citations: 12
Identifier
DOI: 10.1109/tim.2023.3279922
Abstract
In recent years, convolutional neural networks (CNNs) have been widely used in hyperspectral image (HSI) classification due to their exceptional performance in local feature extraction. However, owing to the local connectivity and weight-sharing properties of the convolution kernel, CNNs are limited in long-distance modeling, and deeper networks increase computational cost. To address these issues, this paper proposes a vision Transformer (ViT) based on a light self-Gaussian-attention (LSGA) mechanism, which extracts global deep semantic features. First, a hybrid spatial-spectral tokenizer module extracts shallow spatial-spectral features and expands image patches to generate tokens. Next, the light self-attention uses Q (query), X (original input), and X in place of Q, K (key), and V (value) to reduce computation and parameters. Furthermore, to prevent the lack of location information from causing aliasing of central and neighborhood features, we devise a Gaussian absolute position bias that simulates the HSI data distribution and pulls the attention weights closer to the central query block. Experiments verify the effectiveness of the proposed method, which outperforms state-of-the-art methods on four datasets; specifically, we observed a 0.62% accuracy improvement over A2S2K and a 0.11% improvement over SSFTT. In conclusion, the proposed LSGA-VIT method demonstrates promising results in HSI classification and shows potential for addressing location-aware long-distance modeling and computational cost. Our code is available at https://github.com/machao132/LSGA-VIT.
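The two mechanisms the abstract names, light self-attention (Q, X, X in place of Q, K, V) and the Gaussian absolute position bias, can be summarized in a short sketch. The following PyTorch snippet is a minimal, hypothetical rendering of that description, assuming a single learned query projection, the raw token sequence X standing in for both keys and values, and an additive Gaussian bias over key positions with a learnable spread; all names and the exact bias form are assumptions for illustration, not the authors' implementation (see the linked repository for that).

import torch
import torch.nn as nn

class LSGAttention(nn.Module):
    # Sketch of light self-attention with a Gaussian absolute position bias.
    # Hypothetical illustration of the abstract's LSGA description.
    def __init__(self, dim, num_tokens, center_idx):
        super().__init__()
        # Light self-attention: only Q is projected; the input X itself
        # replaces K and V, removing two projection layers' parameters
        # and their matrix multiplications.
        self.q_proj = nn.Linear(dim, dim)
        self.scale = dim ** -0.5
        # Squared token distance from the central query block, used to
        # build a bias that favors tokens near the center.
        idx = torch.arange(num_tokens, dtype=torch.float32)
        self.register_buffer("dist2", (idx - center_idx) ** 2)
        self.sigma = nn.Parameter(torch.tensor(1.0))  # learnable spread (assumed)

    def forward(self, x):
        # x: (batch, num_tokens, dim) token sequence from the tokenizer
        q = self.q_proj(x)                                    # Q = X W_q
        scores = q @ x.transpose(-2, -1) * self.scale         # Q X^T / sqrt(d); X acts as K
        bias = torch.exp(-self.dist2 / (2 * self.sigma ** 2)) # Gaussian position bias
        attn = torch.softmax(scores + bias, dim=-1)           # bias broadcast over queries
        return attn @ x                                       # X acts as V

For example, a 9x9 spatial patch flattened to 81 tokens would use num_tokens=81 and center_idx=40, biasing every query's attention toward the token of the center pixel, which is the pixel being classified.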