Authors
Xinyu Xu, Huazhen Liu, Tao Zhang, Huilin Xiong, Wenxian Yu
Abstract
Semantic segmentation is an important branch of image processing and computer vision. With the popularity of deep learning, various convolutional neural networks (CNNs) have been proposed for pixel-level classification and segmentation tasks. In practical scenarios, however, imaging angles are often arbitrary, as in water body images from remote sensing and capillary and polyp images in the medical domain, where prior orientation information is typically unavailable to guide these networks toward more effective features. Learning features from objects with diverse orientations therefore poses a significant challenge, since most CNN-based semantic segmentation networks lack the rotation equivariance needed to resist disturbances caused by orientation changes. To address this challenge, this paper first constructs a universal convolution group framework aimed at exploiting orientation information more fully and equipping the network with rotation equivariance. We then mathematically design a padding-based rotation equivariant convolution mode (PreCM), which is not only applicable to multi-scale images and convolutional kernels but can also serve as a drop-in replacement for various types of convolutions, such as dilated convolutions, transposed convolutions, and asymmetric convolutions. To quantitatively assess the impact of image rotation on semantic segmentation, we also propose a new evaluation metric, Rotation Difference (RD). Replacement experiments on six existing semantic segmentation networks across three datasets (i.e., Satellite Images of Water Bodies, DRIVE, and Floodnet) show that the average Intersection over Union (IoU) of their PreCM-based versions improves by 6.91%, 10.63%, 4.53%, 5.93%, 7.48%, and 8.33%, respectively, over the original versions under random-angle rotation, and that the average RD values decrease by 3.58%, 4.56%, 3.47%, 3.66%, 3.47%, and 3.43%, respectively. The code can be downloaded from https://github.com/XinyuXu414.
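The abstract does not spell out how Rotation Difference (RD) is computed, so the snippet below is only a rough, illustrative sketch of the underlying idea: it measures the gap in segmentation IoU between an unrotated input and the same input rotated by 90 degrees, which is the kind of rotation sensitivity PreCM aims to suppress. The helper names `iou` and `equivariance_gap`, and the toy convolution standing in for a segmentation network, are assumptions introduced here for illustration and are not the authors' PreCM or RD implementation.

```python
import torch
import torch.nn as nn

def iou(pred, target, eps=1e-6):
    """Binary IoU between a thresholded prediction and a ground-truth mask."""
    pred = (pred > 0.5).float()
    inter = (pred * target).sum()
    union = pred.sum() + target.sum() - inter
    return (inter + eps) / (union + eps)

@torch.no_grad()
def equivariance_gap(model, image, mask, k=1):
    """Gap between IoU on the original input and IoU on the input rotated by k*90 degrees.

    A perfectly rotation-equivariant segmentation network would give a gap of ~0,
    because rotating the input would only rotate the predicted mask accordingly.
    (Illustrative assumption; not the paper's RD definition.)
    """
    model.eval()
    iou_orig = iou(torch.sigmoid(model(image)), mask)
    rot_image = torch.rot90(image, k, dims=(-2, -1))
    rot_mask = torch.rot90(mask, k, dims=(-2, -1))
    iou_rot = iou(torch.sigmoid(model(rot_image)), rot_mask)
    return (iou_orig - iou_rot).abs().item()

# Toy usage: a plain (non-equivariant) conv layer stands in for a segmentation network.
if __name__ == "__main__":
    net = nn.Conv2d(3, 1, kernel_size=3, padding=1)
    x = torch.rand(1, 3, 64, 64)
    y = (torch.rand(1, 1, 64, 64) > 0.5).float()
    print("IoU gap under 90-degree rotation:", equivariance_gap(net, x, y))
```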