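Computer science
Segmentation
Upsampling
Artificial intelligence
Pyramid (geometry)
Encoder
Convolutional neural network
Feature (linguistics)
Transformer
Computer vision
Remote sensing
Pattern recognition (psychology)
Image (mathematics)
Geology
Linguistics
Philosophy
Physics
Quantum mechanics
Voltage
Optics
Operating system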
Authors
Bin Liu,Bing Li,Victor Sreeram,Shuofeng Li
Abstract
Remote sensing (RS) images play an indispensable role in many key fields, such as environmental monitoring, precision agriculture, and urban resource management. Conventional deep convolutional neural networks (CNNs) suffer from limited receptive fields. To address this problem, this paper introduces MBT-UNet, a hybrid network model that combines the advantages of CNNs and Transformers. First, a multi-branch encoder based on the pyramid vision transformer (PVT) is proposed to capture multi-scale feature information effectively. Second, an efficient feature fusion module (FFM) is proposed to improve the integration of features at different scales. Finally, in the decoder stage, a multi-scale upsampling module (MSUM) is proposed to refine the segmentation results and improve segmentation accuracy. Experiments are conducted on the ISPRS Vaihingen, Potsdam, LoveDA, and UAVid datasets. The results show that MBT-UNet surpasses state-of-the-art algorithms on key performance metrics, confirming its strong performance in high-precision remote sensing image segmentation.