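Computer science
Segmentation
Artificial intelligence
Convolutional neural network
Transformer
Pixel
Remote sensing
Parsing
High resolution
Image resolution
Pattern recognition (psychology)
Computer vision
Geography
Physics
Quantum mechanics
Voltage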
Authors
Yicheng Qiao,Wei Liu,Bin Liang,Pengyun Wang,Haopeng Zhang,Junli Yang
Source
Journal: IEEE Aerospace Conference
Date: 2023-03-04
Pages: 1-6
Citations: 2
Identifier
DOI:10.1109/aero55745.2023.10115761
Abstract
With the development of remote sensing, semantic segmentation of high-resolution remote sensing images (RSIs) is increasingly essential. At the same time, the characteristics of objects in RSIs, such as large size, variation in object scales, and complex details, make it necessary to capture both long-range context and local information. Methods such as Fully Convolutional Networks (FCN) and the Pyramid Scene Parsing Network (PSPNet) lack the ability to capture long-range dependencies because of the limited receptive field of Convolutional Neural Networks (CNNs). In contrast, the self-attention mechanism in Transformer models, which captures the correlation between pixels, is remarkably capable of capturing long-range context. One of the most outstanding Transformer models is the Masked-attention Mask Transformer (Mask2Former), which adopts the mask classification method. We propose a model, SeMask-Mask2Former, with boundary loss. Semantically Masked (SeMask) is the model's backbone and Mask2Former is the decoder. Concretely, mask classification, which generates one or more masks for specific categories to perform fine-grained segmentation, is especially suitable for handling the large within-class and small inter-class variance characteristic of RSIs. Extensive experimental results show that SeMask-Mask2Former obtains better results in semantic segmentation of high-resolution RSIs on the ISPRS Potsdam dataset than CNN-based methods and other state-of-the-art Transformer-based methods. Extensive ablation studies conducted on the Potsdam dataset verify the contribution of each component and optimization strategy in SeMask-Mask2Former.
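As an illustration of the mask classification idea mentioned in the abstract, the sketch below shows how a set of per-query class scores and binary masks can be merged into a per-pixel label map. It follows the semantic inference rule commonly used with MaskFormer-style models and is not taken from the paper. The function name `semantic_inference`, the array shapes, and the toy inputs are assumptions made for this example, and NumPy stands in for a deep learning framework.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def semantic_inference(class_logits, mask_logits):
    """Merge Q mask predictions into one label map (hypothetical helper).

    class_logits: (Q, C + 1) array; the last column is the "no object" class.
    mask_logits:  (Q, H, W) array of per-query binary mask logits.
    Returns an (H, W) integer array of class indices in [0, C).
    """
    # Class probabilities per query, dropping the "no object" column.
    class_probs = softmax(class_logits, axis=-1)[:, :-1]   # (Q, C)
    # Per-pixel mask membership probabilities per query.
    mask_probs = sigmoid(mask_logits)                      # (Q, H, W)
    # Several queries may vote for the same class; their masks are summed,
    # weighted by how confident each query is in that class.
    scores = np.einsum("qc,qhw->chw", class_probs, mask_probs)
    return scores.argmax(axis=0)


# Toy input: two queries, two classes, a 2x4 image split left/right.
class_logits = np.array([[10.0, -10.0, -10.0],
                         [-10.0, 10.0, -10.0]])
mask_logits = np.full((2, 2, 4), -10.0)
mask_logits[0, :, :2] = 10.0   # query 0 covers the left half
mask_logits[1, :, 2:] = 10.0   # query 1 covers the right half
label_map = semantic_inference(class_logits, mask_logits)
print(label_map)
```

Because each class score is a sum over queries, one category can be covered by several masks, which is the property the abstract points to for classes with large within-class variance.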