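Computer science
Artificial intelligence
Convolutional neural network
Redundancy (engineering)
Block (permutation group theory)
Pattern recognition (psychology)
Focus (optics)
Salient
Convolution (computer science)
Computer vision
Artificial neural network
Physics
Geometry
Mathematics
Optics
Operating system
Authors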
Yijin Li,Jiaxin Wang,Si-Bao Chen,Jin Tang,Bin Luo
Identifier
DOI:10.1117/1.jrs.17.016517
Abstract
Remote sensing image scene classification has been widely researched with the aim of assigning semantic labels to land cover. Although convolutional neural networks (CNNs), such as VggNet and ResNet, have achieved good performance, the complex backgrounds and redundant information in remote sensing images limit further gains in accuracy. We propose an enhanced multihead self-attention block network, which effectively reduces the adverse impact of the background and emphasizes the main information. Because the high-level information of a CNN may be redundant, we replace only the final three bottleneck blocks of ResNet50 with the enhanced multihead self-attention layer, so that the model focuses more effectively on the salient region of each image. Our enhanced multihead self-attention layer provides the following improvements over the classical module. First, we construct a triple-way convolution to deal with the arbitrary directionality of remote sensing images and obtain more stable attention information. Second, improved relative position encodings account for the relative distance between features at different locations. Finally, we use depthwise convolution and an InstanceNorm operation to restore diversity among the attention heads. Comparison and ablation experiments carried out on three public datasets show that our approach improves significantly upon the baseline and achieves remarkable performance compared with some state-of-the-art methods.
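For readers unfamiliar with the classical module that the abstract builds on, the following is a minimal pure-Python sketch of multihead self-attention with a relative-position bias. It is not the authors' enhanced layer: the triple-way convolution, the improved relative position encodings, the depthwise convolution and InstanceNorm are all omitted. The function names, the projection matrices `wq`, `wk`, `wv`, and the scalar-per-offset `rel_bias` table are hypothetical choices made for illustration, and the tokens are treated as a 1D sequence rather than a 2D feature map.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an n x k matrix by a k x m matrix (lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def multihead_self_attention(tokens, wq, wk, wv, rel_bias, num_heads):
    """Classical multihead self-attention over n tokens of width d.

    rel_bias maps the offset (i - j) to a scalar added to the attention
    logit; this is an assumed, simplified stand-in for relative position
    encodings, not the scheme described in the abstract.
    """
    n, d = len(tokens), len(tokens[0])
    if d % num_heads != 0:
        raise ValueError("token width must be divisible by num_heads")
    head_dim = d // num_heads
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    out = [[0.0] * d for _ in range(n)]
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim  # channels owned by head h
        for i in range(n):
            logits = [
                sum(q[i][c] * k[j][c] for c in range(lo, hi)) / math.sqrt(head_dim)
                + rel_bias[i - j]
                for j in range(n)
            ]
            weights = softmax(logits)
            for c in range(lo, hi):
                out[i][c] = sum(weights[j] * v[j][c] for j in range(n))
    return out


def random_matrix(rows, cols, rng):
    """Small random matrix used as a placeholder for learned weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n, d, heads = 4, 6, 2
tokens = random_matrix(n, d, rng)
wq, wk, wv = (random_matrix(d, d, rng) for _ in range(3))
rel_bias = {off: rng.uniform(-0.5, 0.5) for off in range(-(n - 1), n)}
out = multihead_self_attention(tokens, wq, wk, wv, rel_bias, heads)
print(len(out), len(out[0]))
```

In a ResNet50-style network, a layer of this kind would take the place of the spatial convolution inside a bottleneck block, with the feature map flattened into tokens; how the authors wire this up in detail is not stated in the abstract.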