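Computer science
Artificial intelligence
Lidar
Pattern recognition (psychology)
Probabilistic logic
Feature extraction
Hyperspectral imaging
Feature learning
Contextual image classification
Convolutional neural network
Feature (linguistics)
Remote sensing
Data mining
Geography
Image (mathematics)
Philosophy
Linguistics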
Authors
Kexing Ding, Ting Lu, Wei Fu, Shutao Li, Fuyan Ma
Source
Journal: IEEE Transactions on Geoscience and Remote Sensing
[Institute of Electrical and Electronics Engineers]
Date: 2022-01-01
Volume/Issue: 60: 1-13
Citations: 52
Identifier
DOI:10.1109/tgrs.2022.3216319
Abstract
Hyperspectral images (HSI) contain rich spatial and spectral detail, while light detection and ranging (LiDAR) data provide elevation information. Fusing HSI and LiDAR data can therefore yield more accurate image classification, and has become an active research topic. However, capturing complex local and global spatial-spectral associations is difficult, and building an effective interaction between multi-modal data is another important issue. To this end, this paper proposes a novel global-local transformer network (GLT-Net) for the joint classification of HSI and LiDAR data. The main idea is to fully exploit the strength of the convolution operator in characterizing locally correlated features together with the capability of the transformer architecture to learn long-range dependencies. Moreover, multi-scale feature fusion and probabilistic decision fusion strategies are designed within the same framework to further improve classification performance. The proposed GLT-Net consists mainly of three parts: multi-scale local spatial feature learning, global spectral feature learning, and global-local feature fusion classification. Specifically, multi-modal image cubes of different sizes are first extracted and fed into convolutional neural networks (CNNs) to learn local spatial features, followed by multi-modal information propagation and spatial-attention-guided multi-scale feature fusion. Next, by treating the spectral feature channels as a sequence, vision transformers are introduced to model global spectral dependencies. Finally, multiple class estimations based on local and global features are integrated via a probabilistic decision fusion strategy. In this way, the complementary information in multi-modal data, as well as local and global spectral-spatial information, can be fully mined and jointly utilized.
Extensive experiments on three popular HSI and LiDAR datasets demonstrate that the proposed method outperforms state-of-the-art methods. The source code of the proposed method will be made publicly available at https://github.com/Ding-Kexin/GLT-Net.
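
The abstract says that class estimations from local and global features are combined by probabilistic decision fusion, but it does not give the fusion rule. The sketch below illustrates one common form of the idea, a weighted average of per-branch softmax probabilities. The averaging rule, the function names `softmax` and `fuse_decisions`, the branch weights, and the example scores are all assumptions made for illustration; they are not taken from the paper or its repository.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_decisions(logit_sets, weights=None):
    """Fuse several class estimates by averaging their softmax probabilities.

    logit_sets: list of per-branch score lists, all of the same length.
    weights:    optional non-negative branch weights (uniform if omitted).
    Returns (fused_probabilities, predicted_class_index).

    NOTE: weighted averaging is an assumed fusion rule, chosen for
    illustration; the abstract does not state which rule GLT-Net uses.
    """
    if not logit_sets:
        raise ValueError("need at least one branch")
    n_classes = len(logit_sets[0])
    if any(len(scores) != n_classes for scores in logit_sets):
        raise ValueError("all branches must score the same classes")
    if weights is None:
        weights = [1.0] * len(logit_sets)
    if len(weights) != len(logit_sets):
        raise ValueError("one weight per branch is required")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    probs = [softmax(scores) for scores in logit_sets]
    fused = [
        sum(w * p[c] for w, p in zip(weights, probs)) / total_weight
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=fused.__getitem__)
    return fused, predicted


if __name__ == "__main__":
    # Hypothetical scores for one pixel over three classes.
    local_scores = [2.0, 1.0, 0.1]   # e.g. from a local (CNN) branch
    global_scores = [0.5, 2.5, 0.3]  # e.g. from a global (transformer) branch
    fused, predicted = fuse_decisions([local_scores, global_scores])
    print("fused probabilities:", fused)
    print("predicted class index:", predicted)
```

Averaging probabilities rather than raw scores keeps each branch on a common scale, so a branch with large-magnitude scores does not dominate the fused decision unless it is given a larger weight.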