Computer science
Artificial intelligence
Feature extraction
Lidar
Frit
Transformer
Feature learning
Pattern recognition (psychology)
Computer vision
Hyperspectral imaging
Remote sensing
Engineering
Geography
Archaeology
Voltage
Electrical engineering
Authors
Xudong Zhao,Mengmeng Zhang,Ran Tao,Wei Li,Wenzhi Liao,Lianfang Tian,Wilfried Philips
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems
[Institute of Electrical and Electronics Engineers]
Date: 2022-07-15
Volume/Issue: 35 (2): 2314-2326
Cited by: 55
Identifier
DOI:10.1109/tnnls.2022.3189994
Abstract
With the recent development of the joint classification of hyperspectral image (HSI) and light detection and ranging (LiDAR) data, deep learning methods have achieved promising performance owing to their ability to extract locally semantic features. Nonetheless, the limited receptive field restricts convolutional neural networks (CNNs) from representing global contextual and sequential attributes, while visual image transformers (VITs) lose local semantic information. Focusing on these issues, we propose a fractional Fourier image transformer (FrIT) as a backbone network to extract both global and local contexts effectively. In the proposed FrIT framework, HSI and LiDAR data are first fused at the pixel level, and both multisource and HSI feature extractors are utilized to capture local contexts. Then, a plug-and-play image transformer, FrIT, is explored for global contextual and sequential feature extraction. Unlike the attention-based representations in the classic VIT, FrIT is capable of speeding up transformer architectures massively and learning valuable contextual information effectively and efficiently. More significantly, to reduce redundancy and loss of information from shallow to deep layers, FrIT is devised to connect contextual features in multiple fractional domains. Five HSI and LiDAR scenes, including one newly labeled benchmark, are utilized for extensive experiments, showing improvement over both CNNs and VITs.
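The abstract's central tool is the fractional Fourier transform, which interpolates continuously between the identity (order 0) and the ordinary Fourier transform (order 1), giving the "multiple fractional domains" the features are connected in. A minimal numerical sketch of that idea, not the authors' implementation: a discrete fractional Fourier matrix obtained by raising the unitary DFT matrix to a fractional power through its eigendecomposition (the function name `frft_matrix` is hypothetical).

```python
import numpy as np

def frft_matrix(n, a):
    """Discrete fractional Fourier transform matrix of order `a`.

    Built as F**a = V diag(lambda**a) V^{-1} from the eigendecomposition
    of the unitary DFT matrix F. Caveat: for non-integer `a` the result
    depends on the eigenvector basis chosen inside F's degenerate
    eigenspaces; standard DFrFT constructions pin this down with
    Hermite-Gauss-like eigenvectors.
    """
    F = np.fft.fft(np.eye(n), norm="ortho")  # unitary DFT matrix
    lam, V = np.linalg.eig(F)                # eigenvalues on the unit circle
    return V @ np.diag(lam ** a) @ np.linalg.inv(V)

x = np.random.randn(8)
# Order 1 recovers the ordinary (unitary) DFT ...
assert np.allclose(frft_matrix(8, 1.0) @ x, np.fft.fft(x, norm="ortho"))
# ... and order 0 is the identity.
assert np.allclose(frft_matrix(8, 0.0) @ x, x)
```

Because all fractional orders share one eigenbasis here, orders compose additively: applying the order-0.5 matrix twice reproduces the full DFT, which is the sense in which intermediate fractional domains sit "between" the spatial and frequency domains.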