Journal: IEEE Transactions on Geoscience and Remote Sensing [Institute of Electrical and Electronics Engineers] Date: 2023-01-01 Volume/Issue: 61: 1-16 Citations: 10
Identifier
DOI:10.1109/tgrs.2023.3294424
Abstract
Hyperspectral image (HSI) classification aims to assign each pixel to a land cover category, and it is receiving increasing attention from both industry and academia. The main challenge lies in capturing the reliable and informative spatial and spectral dependencies concealed in the HSI for each class. To address this challenge, we propose a Spatial-Spectral 1DSwin Transformer with Group-wise Feature Tokenization (SS1DSwin) for HSI classification, which reveals local and hierarchical spatial-spectral relationships from two different perspectives. SS1DSwin mainly consists of a Group-wise Feature Tokenization Module (GFTM) and a 1DSwin Transformer with Cross-block Normalized Connection Module (TCNCM). In GFTM, we reorganize an image patch into overlapping cubes and generate group-wise token embeddings with Multi-head Self-Attention (MSA) to learn the local spatial-spectral relationship along the spatial dimension. In TCNCM, we adopt the shifted windowing strategy to acquire the hierarchical spatial-spectral relationship along the spectral dimension with 1D Window based Multi-head Self-Attention (1DW-MSA) and 1D Shifted Window based Multi-head Self-Attention (1DSW-MSA), and we leverage Cross-block Normalized Connection (CNC) to adaptively fuse the feature maps from different blocks. SS1DSwin applies these two modules in order and predicts the class label for each pixel. To test the effectiveness of the proposed method, extensive experiments are conducted on four HSI datasets, and the results indicate that SS1DSwin outperforms several current state-of-the-art methods. The source code of the proposed method is available at https://github.com/Minato252/SS1DSwin.
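The 1D windowing and shifted windowing idea mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names, the window size, the shift amount, and the projection-free single-head attention are all assumptions chosen for illustration, and the sketch omits the masking that Swin-style models typically apply to windows that wrap around after a cyclic shift.

```python
import numpy as np


def window_partition_1d(tokens, window_size):
    """Split a (length, channels) token sequence into non-overlapping 1D windows.

    Returns an array of shape (length // window_size, window_size, channels).
    """
    length, channels = tokens.shape
    if length % window_size != 0:
        raise ValueError("sequence length must be divisible by window_size")
    return tokens.reshape(length // window_size, window_size, channels)


def shifted_window_partition_1d(tokens, window_size, shift):
    """Cyclically shift the sequence by `shift` positions, then partition it.

    Tokens that sat at a window boundary before the shift now share a window,
    so information can flow across the original window borders.
    """
    shifted = np.roll(tokens, -shift, axis=0)
    return window_partition_1d(shifted, window_size)


def window_self_attention(windows):
    """Single-head scaled dot-product self-attention inside each window.

    Hypothetical simplification: queries, keys, and values are the tokens
    themselves (no learned projections, no multi-head split, no masking).
    """
    dim = windows.shape[-1]
    scores = windows @ windows.transpose(0, 2, 1) / np.sqrt(dim)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ windows


# Toy sequence of 8 spectral tokens with 1 channel each (values 0..7).
tokens = np.arange(8, dtype=float).reshape(8, 1)

regular = window_partition_1d(tokens, window_size=4)
shifted = shifted_window_partition_1d(tokens, window_size=4, shift=2)

print(regular[:, :, 0])  # windows over tokens [0..3] and [4..7]
print(shifted[:, :, 0])  # windows over tokens [2..5] and [6, 7, 0, 1]

attended = window_self_attention(regular)
print(attended.shape)  # (2, 4, 1)
```

Alternating the regular and shifted partitions across consecutive blocks is what lets window-local attention build up longer-range dependencies along the sequence, here taken to be the spectral dimension.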