Yalun Wang, Shidong Chen, Huicong Bian, Weixiao Li, Qin Lu
Identifier
DOI:10.1109/ijcnn54540.2023.10191758
Abstract
Information at different resolutions plays distinct roles in computer vision tasks. Although research on exploiting multi-resolution information in semantic segmentation has made progress, its use in real-time semantic segmentation remains underexplored. To address this, we propose the Deep Multi-Resolution Network (DMRNet), a lightweight model that uses information at different resolutions for real-time semantic segmentation. The model consists of several branches at different resolutions, and information is fused between neighbouring branches after convolution operations. At the end of the lowest-resolution branch, we design an enhanced semantic information module, the Amplify Aggregate Pyramid Pooling Module (AAPPM), to balance the extraction of semantic information against inference speed. In addition, at the end of all branches, we propose a Multi-Resolution Fusion Module (MRFM) to guide information fusion across branches, which alleviates the problem of spatial details being overwhelmed by semantic information. On Cityscapes and CamVid, the most widely used datasets in semantic segmentation, our method strikes a balance between accuracy and inference speed: on a single 2080Ti GPU, DMRNet achieves 77.6% and 74.7% accuracy at inference speeds of 68.7 FPS and 91.6 FPS, respectively.
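The abstract's core idea of fusing information between neighbouring resolution branches can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's method: the actual DMRNet fusion uses learned convolutions and the AAPPM/MRFM modules, none of which are specified in the abstract. Here each branch simply adds its finer neighbour (downsampled by strided slicing) and its coarser neighbour (upsampled by nearest-neighbour repetition).

```python
import numpy as np

def fuse_neighbouring_branches(branches):
    """Toy fusion across resolution branches (coarsest last).

    Each branch keeps its own resolution and absorbs information
    from its immediate neighbours. A real multi-resolution network
    would use learned convolutions here; this sketch uses plain
    element-wise addition for illustration only.
    """
    fused = []
    for i, feat in enumerate(branches):
        out = feat.astype(float).copy()
        if i > 0:
            # finer neighbour, downsampled 2x by striding
            out += branches[i - 1][::2, ::2]
        if i < len(branches) - 1:
            # coarser neighbour, upsampled 2x by nearest-neighbour repeat
            up = np.repeat(np.repeat(branches[i + 1], 2, axis=0), 2, axis=1)
            out += up
        fused.append(out)
    return fused

# Three branches at full, half, and quarter resolution (hypothetical sizes).
high = np.ones((8, 8))
mid = np.ones((4, 4))
low = np.ones((2, 2))

fused = fuse_neighbouring_branches([high, mid, low])
print([f.shape for f in fused])  # each branch keeps its own resolution
```

Note that the middle branch receives contributions from both neighbours, while the highest- and lowest-resolution branches each receive only one; this mirrors how information propagates between adjacent branches in the architecture described above.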