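Computer science
Deep learning
Inference
Monocular
Artificial intelligence
Notation
Set (abstract data type)
Image (mathematics)
Pose
Computer engineering
Machine learning
Programming language
Mathematics
Arithmetic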
Authors
Matteo Poggi, Fabio Tosi, Filippo Aleotti, Stefano Mattoccia
Source
Journal: IEEE Transactions on Intelligent Transportation Systems
[Institute of Electrical and Electronics Engineers]
Date: 2022-03-14
Volume (issue): 23 (10): 17342-17353
Citations: 10
Identifier
DOI:10.1109/tits.2022.3157265
Abstract
Single-image depth estimation is a longstanding challenge in computer vision. Although the problem is ill-posed, deep learning has enabled astonishing results with both supervised and self-supervised training paradigms. State-of-the-art solutions achieve remarkably accurate depth estimation from a single image by deploying huge deep architectures that require powerful dedicated hardware to run in a reasonable amount of time. This complexity makes them unsuited to a broad category of applications that require devices with constrained resources or limited memory. To tackle this issue, this paper proposes a family of compact yet effective CNNs for monocular depth estimation, trained with self-supervision from a binocular stereo rig. Compared to complex state-of-the-art models, our lightweight architectures, namely PyD-Net and PyD-Net2, trade a small drop in accuracy for a drastic reduction in runtime and memory requirements, by a factor ranging from $2\times$ to $100\times$. Moreover, by stopping inference early, our networks can run real-time monocular depth estimation on a broad set of embedded or consumer devices, even those not equipped with a GPU, with negligible (or no) loss in accuracy, making them ideally suited for real applications with strict constraints on hardware resources or power consumption.
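The abstract does not spell out how a stereo rig supplies the training signal, so the following is only a generic illustration of the idea, not the authors' loss. It assumes a rectified stereo pair, in which a pixel at column x in the left image appears at column x - d in the right image for disparity d. A predicted disparity map can then be scored by how well the right image, resampled at x - d, reproduces the left image. The function names `warp_right_to_left` and `photometric_l1`, the nearest-pixel sampling, and the border clamping are all assumptions made for this sketch.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def warp_right_to_left(right: Image, disparity: Image) -> Image:
    """Reconstruct the left view by sampling the right view at x - d.

    Assumes a rectified pair; uses nearest-pixel sampling for simplicity.
    """
    height, width = len(right), len(right[0])
    left_hat: Image = []
    for y in range(height):
        row = []
        for x in range(width):
            src_x = int(round(x - disparity[y][x]))
            src_x = min(max(src_x, 0), width - 1)  # clamp at image borders
            row.append(right[y][src_x])
        left_hat.append(row)
    return left_hat


def photometric_l1(left: Image, left_hat: Image) -> float:
    """Mean absolute intensity difference between two same-sized images."""
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(left, left_hat)
        for a, b in zip(row_a, row_b)
    )
    return total / (len(left) * len(left[0]))
```

A disparity map that matches the true shift between the two views yields a lower reconstruction error than a wrong one, which is what lets the second camera stand in for ground-truth depth labels during training.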