Mixture-of-experts for semantic segmentation of remoting sensing image
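Computer science
Computer vision
Segmentation
Image segmentation
Artificial intelligence
Image (mathematics)
Natural language processing
Information retrieval
Authors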
He Shaofeng,Cheng Qiu,Huai Yu,Zhu Zhongke,Jie Ding
Identifier
DOI:10.1117/12.3035091
Abstract
To address complex backgrounds, imbalanced samples, and inconsistent sample scales in remote sensing image segmentation, this study proposes a segmentation network based on the Swin Transformer (SW). The backbone builds on the Swin Transformer and incorporates a Mixture of Experts (MoE) structure that partitions the model's parameter space, so that remote sensing images of different scales and scenes activate different expert models at inference, which addresses the complex-background and imbalanced-sample problems. A channel attention module added to the decoder aggregates the spatial and channel information weights of remote sensing images, making effective use of the multiscale information of buildings and reducing the loss of image detail during training. Experiments on the public remote sensing semantic segmentation dataset NAIC validate the proposed algorithm, which achieves a mean Intersection over Union (mIoU) of 84.06 and an F1 score of 90.12, outperforming DeepLabV3+ by 3.44 and 3.77, respectively.
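The two reported metrics can be illustrated with a minimal sketch. It assumes per-class IoU and F1 are computed from a pixel-level confusion matrix and averaged over classes (macro averaging). The abstract does not state the averaging convention, and the helper names `confusion_matrix` and `miou_and_f1` are hypothetical, not taken from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_f1(cm):
    """Return (mean IoU, mean F1), macro-averaged over classes that occur.

    Assumption: macro averaging; the paper may use a different convention.
    """
    n = len(cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        if tp + fp + fn == 0:
            continue  # class absent from both labels and predictions
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    if not ious:
        raise ValueError("no class occurs in labels or predictions")
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy example: six flattened pixels, two classes.
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
miou, f1 = miou_and_f1(cm)
print(cm, miou, f1)
```

For any single class, F1 = 2·IoU / (1 + IoU), so a class's F1 is never lower than its IoU. This is consistent with the reported F1 score being higher than the reported mIoU.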