Discriminative
Computer science
Artificial intelligence
Convolutional neural network
Remote sensing
Robustness (evolution)
Joint (building)
Pattern recognition (psychology)
Contextual image classification
Computer vision
Feature extraction
Transformer
Image (mathematics)
Engineering
Geology
Gene
Electrical engineering
Biochemistry
Voltage
Architectural engineering
Chemistry
Authors
Peifang Deng, Kejie Xu, Huixiao Hong
Source
Journal: IEEE Geoscience and Remote Sensing Letters
[Institute of Electrical and Electronics Engineers]
Date: 2022-01-01
Volume/Issue: 19: 1-5
Cited by: 40
Identifier
DOI:10.1109/lgrs.2021.3109061
Abstract
Scene classification is an indispensable part of remote sensing image interpretation, and various convolutional neural network (CNN)-based methods have been explored to improve classification accuracy. Although they have shown good classification performance on high-resolution remote sensing (HRRS) images, the discriminative ability of the extracted features is still limited. In this letter, a high-performance joint framework combining CNNs and a vision transformer (ViT), termed CTNet, is proposed to further boost the discriminative ability of features for HRRS scene classification. The CTNet method contains two modules: the stream of ViT (T-stream) and the stream of CNNs (C-stream). In the T-stream, flattened image patches are sent into a pretrained ViT model to mine semantic features in HRRS images. To complement the T-stream, a pretrained CNN is transferred to extract local structural features in the C-stream. Then, the semantic features and structural features are concatenated to predict labels of unknown samples. Finally, a joint loss function is developed to optimize the joint model and increase intraclass aggregation. The highest accuracies obtained by the CTNet method on the aerial image dataset (AID) and the Northwestern Polytechnical University (NWPU)-RESISC45 dataset are 97.70% and 95.49%, respectively. The classification results reveal that the proposed method achieves high classification performance compared with other state-of-the-art (SOTA) methods.
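The dual-stream fusion described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the pretrained ViT and CNN backbones are stubbed out with random feature vectors, the feature dimensions and class count are assumed, and the intraclass-aggregation term is modeled here as a center-loss-style penalty, which is one common choice; the letter's exact joint loss may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Assumed dimensions: 768-d ViT features, 512-d CNN features,
# 45 classes (as in NWPU-RESISC45).
batch, d_vit, d_cnn, n_classes = 4, 768, 512, 45

# Stand-ins for the two backbones' outputs.
t_features = rng.standard_normal((batch, d_vit))  # T-stream: semantic features
c_features = rng.standard_normal((batch, d_cnn))  # C-stream: structural features

# Fusion: concatenate the two streams, then a linear classifier.
fused = np.concatenate([t_features, c_features], axis=1)
W = rng.standard_normal((d_vit + d_cnn, n_classes)) * 0.01
probs = softmax(fused @ W)

labels = np.array([0, 1, 2, 3])

# Joint loss: cross-entropy plus an intraclass-aggregation term
# (center-loss-style penalty; the weighting 0.01 is an assumption).
ce = -np.log(probs[np.arange(batch), labels]).mean()
centers = np.zeros((n_classes, d_vit + d_cnn))
for c in np.unique(labels):
    centers[c] = fused[labels == c].mean(axis=0)
center_loss = ((fused - centers[labels]) ** 2).sum(axis=1).mean()
joint_loss = ce + 0.01 * center_loss
```

The key design point the abstract emphasizes is complementarity: the concatenated vector carries both global semantic cues (ViT) and local structure (CNN), and the extra loss term pulls same-class features toward a common center.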