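Monocular
Computer science
Artificial intelligence
Generalization
Feature (linguistics)
Ground truth
Block (permutation group theory)
Autoencoder
Pattern recognition (psychology)
Unsupervised learning
Computer vision
Inference
Channel (broadcasting)
Artificial neural network
Mathematics
Mathematical analysis
Philosophy
Linguistics
Computer network
Geometry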
Authors
Chuanwu Ling,Xiaogang Zhang,Hua Chen
Identifier
DOI:10.1109/tmm.2021.3091308
Abstract
Monocular depth estimation has become one of the most studied topics in computer vision. Most approaches treat depth prediction as a fully supervised regression problem requiring vast amounts of corresponding ground-truth depth and image pairs for training. Unsupervised monocular depth estimation has emerged as a promising alternative that eliminates dataset limitations. This paper proposes an end-to-end unsupervised deep learning framework integrating attention blocks and multi-warp loss for monocular depth estimation. In this framework, to explore more general contextual information among the feature volumes, an attention block that sequentially refines the feature maps along the channel and spatial dimensions is inserted after the first and last stages of the network encoder. Additionally, to further utilize the errors in the original disparity estimation from the network, a novel multi-warp reconstruction strategy is designed for the loss function. The experimental results evaluated on the KITTI, CityScapes and Make3D datasets demonstrate the state-of-the-art performance and satisfactory generalization ability of our proposed method.
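The abstract describes an attention block that refines feature maps first along the channel dimension and then along the spatial dimension. The following is only a minimal, parameter-free sketch of that ordering, not the authors' implementation: the function names are hypothetical, the gates are computed from plain averages passed through a sigmoid, and the learned layers that a trained attention block would normally contain are omitted as a simplifying assumption.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a gate derived from its global average.

    feat is a nested list with shape [channels][height][width].
    Assumption: no learned weights, only average pooling plus a sigmoid.
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale each pixel by a gate derived from its mean across channels.

    Assumption: a real block would likely apply a learned convolution here.
    """
    num_c = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    gates = [
        [sigmoid(sum(feat[c][i][j] for c in range(num_c)) / num_c)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[feat[c][i][j] * gates[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(num_c)
    ]


def attention_block(feat):
    """Apply channel refinement first, then spatial refinement."""
    return spatial_attention(channel_attention(feat))


# Toy feature volume: 2 channels, each 2x2.
features = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
refined = attention_block(features)
```

The output keeps the input shape, so a block of this kind can be placed between encoder stages without changing the tensors that later layers expect.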