Topics: Artificial intelligence, Computer science, Pyramid (geometry), Computer vision, Segmentation, Pooling, Feature (linguistics), Scale (ratio), Convolution (computer science), Image resolution, Feature extraction, Pattern recognition (psychology), Mathematics, Geography, Cartography, Artificial neural network, Geometry, Philosophy, Linguistics
Authors
Maoke Yang, Kun Yu, Chi Zhang, Zhiwei Li, Kuiyuan Yang
Source
Venue: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Date: 2018-06-01
Citations: 1085
Identifier
DOI: 10.1109/cvpr.2018.00388
Abstract
Semantic image segmentation is a basic street-scene understanding task in autonomous driving, where each pixel in a high-resolution image is categorized into a set of semantic labels. Unlike other scenarios, objects in autonomous driving scenes exhibit very large scale changes, which poses great challenges for high-level feature representation in the sense that multi-scale information must be correctly encoded. To remedy this problem, atrous convolution [14] was introduced to generate features with larger receptive fields without sacrificing spatial resolution. Built upon atrous convolution, Atrous Spatial Pyramid Pooling (ASPP) [2] was proposed to concatenate multiple atrous-convolved features using different dilation rates into a final feature representation. Although ASPP is able to generate multi-scale features, we argue that the feature resolution along the scale axis is not dense enough for the autonomous driving scenario. To this end, we propose Densely connected Atrous Spatial Pyramid Pooling (DenseASPP), which connects a set of atrous convolutional layers in a dense way, such that it generates multi-scale features that not only cover a larger scale range, but also cover that scale range densely, without significantly increasing the model size. We evaluate DenseASPP on the street-scene benchmark Cityscapes [4] and achieve state-of-the-art performance.
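The abstract's core claim — that dense connections among atrous layers cover the scale range more densely than ASPP's parallel branches — can be illustrated with the standard receptive-field formulas. The sketch below is not the paper's code: the dilation rates (3, 6, 12, 18, 24) and the simplification that a dense topology lets a feature traverse any order-preserving subset of the stacked layers are assumptions made for illustration.

```python
from itertools import combinations

K = 3  # kernel size of each atrous convolution (assumed 3x3)

def rf_single(d):
    """Receptive field of one KxK atrous conv with dilation rate d."""
    return (K - 1) * d + 1

def rf_stack(dilations):
    """Receptive field of sequentially stacked atrous convs:
    the individual fields sum, with (n - 1) one-pixel overlaps removed."""
    rfs = [rf_single(d) for d in dilations]
    return sum(rfs) - (len(rfs) - 1)

rates = (3, 6, 12, 18, 24)  # example dilation rates (assumption)

# ASPP: parallel branches give exactly one scale per dilation rate.
aspp_scales = sorted({rf_single(d) for d in rates})

# DenseASPP (simplified): dense skip connections let a feature pass
# through any order-preserving subset of the stacked atrous layers,
# so many intermediate receptive-field sizes become reachable.
dense_scales = sorted({rf_stack(sub)
                       for r in range(1, len(rates) + 1)
                       for sub in combinations(rates, r)})

print(aspp_scales)                       # [7, 13, 25, 37, 49]
print(len(dense_scales), max(dense_scales))  # 21 127
```

Under these assumptions, five parallel branches yield only five distinct receptive-field sizes, while the same five layers connected densely reach 21 distinct sizes spanning a wider range — the "larger and denser scale coverage" the abstract describes.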