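Computer science
Embedding
Artificial intelligence
Monocular
Ground truth
Outlier
Context (archaeology)
Feature (linguistics)
Hierarchical database model
Pattern recognition (psychology)
Data mining
Linguistics
Biology
Philosophy
Paleontology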
Authors
Lijun Wang,Jianming Zhang,Yifan Wang,Huchuan Lu,Xiang Ruan
Identifier
DOI:10.1007/978-3-030-58558-7_19
Abstract
This paper proposes a hierarchical loss for monocular depth estimation, which measures the differences between the prediction and ground truth in hierarchical embedding spaces of depth maps. To find an appropriate embedding space, we design different architectures for hierarchical embedding generators (HEGs) and explore relevant tasks to train their parameters. Compared to conventional depth losses manually defined on a per-pixel basis, the proposed hierarchical loss can be learned in a data-driven manner. As verified by our experiments, the hierarchical loss, even when learned without additional labels, can capture multi-scale context information, is more robust to local outliers, and thus delivers superior performance. To further improve depth accuracy, a cross-level identity feature fusion network (CLIFFNet) is proposed, where low-level features with finer details are refined using more reliable high-level cues. Through end-to-end training, CLIFFNet can learn to select the optimal combinations of low-level and high-level features, leading to more effective cross-level feature fusion. When trained using the proposed hierarchical loss, CLIFFNet sets a new state of the art on popular depth estimation benchmarks.
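The idea of comparing prediction and ground truth in several embedding spaces, rather than only pixel by pixel, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract describes learned hierarchical embedding generators, whereas `toy_embeddings` below is a hypothetical stand-in that uses a fixed 2x2 average-pooling pyramid, and the mean absolute difference and the per-level `weights` are likewise not taken from the paper.

```python
def avg_pool2(depth):
    """Downsample a 2D depth map (list of rows) by 2x2 average pooling."""
    h, w = len(depth) // 2, len(depth[0]) // 2
    return [
        [
            (depth[2 * i][2 * j] + depth[2 * i][2 * j + 1]
             + depth[2 * i + 1][2 * j] + depth[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def toy_embeddings(depth, levels=3):
    """Hypothetical stand-in for a learned embedding generator:
    returns the input plus (levels - 1) successively pooled copies."""
    feats = [depth]
    for _ in range(levels - 1):
        feats.append(avg_pool2(feats[-1]))
    return feats


def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized 2D maps."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def hierarchical_loss(pred, target, levels=3, weights=None):
    """Weighted sum of per-level distances between the two embedding pyramids.
    The weighting scheme is an assumption, not the paper's formulation."""
    weights = weights or [1.0] * levels
    pred_feats = toy_embeddings(pred, levels)
    target_feats = toy_embeddings(target, levels)
    return sum(
        w * mean_abs_diff(p, t)
        for w, p, t in zip(weights, pred_feats, target_feats)
    )


# Usage: a flat 4x4 ground-truth map and a prediction with one outlier pixel.
target = [[1.0] * 4 for _ in range(4)]
pred = [row[:] for row in target]
pred[0][0] = 17.0
loss = hierarchical_loss(pred, target, levels=3)
```

With `levels=3` the input height and width must be divisible by 4, since each level halves the resolution. In a learned variant, the pooling pyramid would be replaced by the feature maps of a trained network.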