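Decoding methods
Computer science
Benchmark (surveying)
Monocular
Artificial intelligence
Pyramid (geometry)
Encoding (set theory)
Encoder
Computer vision
Algorithm
Pattern recognition (psychology)
Mathematics
Geometry
Geodesy
Set (abstract data type)
Programming language
Geography
Operating system
Authors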
Minsoo Song,Seokjae Lim,Wonjun Kim
Identifier
DOI:10.1109/tcsvt.2021.3049869
Abstract
Following the success of generative models built on deep neural networks, monocular depth estimation has been actively studied using various encoder-decoder architectures. However, the decoding process in most previous methods, which repeats simple up-sampling operations, may fail to fully exploit the underlying properties of well-encoded features for monocular depth estimation. To resolve this problem, we propose a simple but effective scheme that incorporates the Laplacian pyramid into the decoder architecture. Specifically, encoded features are fed into different streams for decoding depth residuals, which are defined by the Laplacian pyramid decomposition, and the corresponding outputs are progressively combined to reconstruct the final depth map from coarse to fine scales. This design is well suited to estimating both the depth boundaries and the global layout precisely. We also propose applying weight standardization to the pre-activation convolution blocks of the decoder, which improves the flow of gradients and thus makes optimization easier. Experimental results on benchmark datasets constructed under various indoor and outdoor environments demonstrate that the proposed method is effective for monocular depth estimation compared to state-of-the-art models. The code and model are publicly available at https://github.com/tjqansthd/LapDepth-release.
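The coarse-to-fine reconstruction described in the abstract can be illustrated with a minimal NumPy sketch of a Laplacian pyramid applied to a known depth map. This is not the authors' network: in the paper the residuals are predicted by decoder streams from encoded features, whereas here they are computed directly from a synthetic depth map. The 2x2 average pooling, the nearest-neighbour up-sampling, and all function names are assumptions chosen for brevity.

```python
import numpy as np

def downsample(x):
    """Halve resolution with 2x2 average pooling (assumed operator; sides must be even)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Double resolution by nearest-neighbour repetition (assumed operator)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(depth, levels):
    """Split a depth map into per-scale residuals plus a coarsest-scale map."""
    residuals = []
    current = depth
    for _ in range(levels):
        coarse = downsample(current)
        # Residual: the detail lost when going down one scale and back up.
        residuals.append(current - upsample(coarse))
        current = coarse
    return residuals, current

def reconstruct(residuals, coarsest):
    """Combine residuals progressively from the coarsest to the finest scale."""
    depth = coarsest
    for residual in reversed(residuals):
        depth = upsample(depth) + residual
    return depth

# Synthetic depth map standing in for ground truth (values are arbitrary).
rng = np.random.default_rng(0)
depth = rng.uniform(1.0, 80.0, size=(32, 64))

residuals, coarsest = laplacian_pyramid(depth, levels=4)
restored = reconstruct(residuals, coarsest)
```

In this sketch the coarsest map carries the global layout and each residual adds finer detail such as boundaries, so summing them from coarse to fine recovers the original depth map up to floating-point error.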