Segmentation
Encoder
Computer science
Convolutional neural network
Transformer
Deep learning
Feature extraction
Pattern recognition (psychology)
Computer vision
Artificial intelligence
Data mining
Physics
Quantum mechanics
Voltage
Operating system
Authors
Heqing Yang, Bing Li, Haiming Liu, Shuofeng Li
Identifier
DOI:10.1016/j.asr.2024.06.056
Abstract
In the field of remote sensing (RS) image analysis, semantic segmentation is a key technology for identifying and analyzing land-surface cover types. In recent years, applying deep learning models to tasks such as road extraction, water distribution extraction, building classification, and building segmentation from RS images has become an important research hotspot. Because of their limited receptive field, traditional convolutional neural networks (CNNs) cannot effectively capture global context information, whereas the Transformer's multi-head self-attention mechanism captures long-range information and addresses this problem well. We therefore propose ST-MDAMNet, which builds on the Swin Transformer and combines it with a multi-dimensional attention mechanism. First, a feature enhancement module (FAM) is introduced after each stage of the Swin Transformer encoder to strengthen the model's ability to identify essential information. Second, a feature fusion module (FFM) is proposed to fuse the multi-scale information from the encoder; it improves the expressiveness of features at different scales and noticeably improves the detection of small targets. Finally, the fused features are fed into the multi-dimensional attention module (MDAM) for further refinement, which substantially improves the semantic segmentation of RS images. We demonstrate the effectiveness of each module through ablation experiments, and comparative experiments on two publicly available large-scale datasets show that the proposed method achieves excellent results compared with state-of-the-art methods.
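The abstract describes the overall data flow (Swin encoder stages, per-stage feature enhancement, multi-scale fusion, multi-dimensional attention, segmentation head) but gives no implementation details. The PyTorch sketch below is only one assumed reading of that pipeline: the dummy pyramid encoder, the internals of the FeatureEnhancement (FAM), FeatureFusion (FFM), and MultiDimAttention (MDAM) placeholders, and all channel and class counts are illustrative assumptions, not the authors' actual design.

# Minimal PyTorch sketch of the pipeline described in the abstract:
# encoder -> per-stage feature enhancement -> multi-scale fusion
# -> multi-dimensional attention -> segmentation head.
# All module internals are placeholder assumptions; the paper only names FAM, FFM, and MDAM.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DummyPyramidEncoder(nn.Module):
    """Tiny conv pyramid standing in for the Swin Transformer backbone."""
    def __init__(self, channels=(96, 192, 384, 768)):
        super().__init__()
        chans = (3,) + tuple(channels)
        self.stages = nn.ModuleList(
            [nn.Conv2d(chans[i], chans[i + 1], 3, stride=4 if i == 0 else 2, padding=1)
             for i in range(4)]
        )

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # four feature maps at 1/4, 1/8, 1/16, 1/32 resolution

class FeatureEnhancement(nn.Module):  # stand-in for the paper's FAM
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, x):
        return x * self.gate(x)  # channel-wise reweighting

class FeatureFusion(nn.Module):  # stand-in for the paper's FFM
    def __init__(self, in_channels, out_channels=256):
        super().__init__()
        self.proj = nn.ModuleList([nn.Conv2d(c, out_channels, 1) for c in in_channels])

    def forward(self, feats):
        target = feats[0].shape[-2:]  # fuse at the finest resolution
        return sum(F.interpolate(p(f), size=target, mode="bilinear", align_corners=False)
                   for p, f in zip(self.proj, feats))

class MultiDimAttention(nn.Module):  # stand-in for the paper's MDAM
    def __init__(self, channels):
        super().__init__()
        self.channel_att = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                         nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.spatial_att = nn.Sequential(nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel_att(x)     # refine along the channel dimension
        return x * self.spatial_att(x)  # refine along the spatial dimensions

class STMDAMNetSketch(nn.Module):
    def __init__(self, encoder, stage_channels=(96, 192, 384, 768), num_classes=6):
        super().__init__()
        self.encoder = encoder
        self.enhance = nn.ModuleList([FeatureEnhancement(c) for c in stage_channels])
        self.fuse = FeatureFusion(stage_channels)
        self.mdam = MultiDimAttention(256)
        self.head = nn.Conv2d(256, num_classes, 1)

    def forward(self, x):
        feats = self.encoder(x)                               # multi-scale encoder features
        feats = [m(f) for m, f in zip(self.enhance, feats)]   # FAM-style step after each stage
        fused = self.mdam(self.fuse(feats))                   # FFM-style fusion, then MDAM-style refinement
        return F.interpolate(self.head(fused), size=x.shape[-2:],
                             mode="bilinear", align_corners=False)

# Example: net = STMDAMNetSketch(DummyPyramidEncoder()); net(torch.randn(1, 3, 512, 512))

In a faithful reproduction, the dummy encoder would be replaced with an actual Swin Transformer backbone that returns the four stage outputs, and the placeholder attention blocks with the modules defined in the paper.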