Self-Supervised Learning of Depth and Ego-Motion for 3D Perception in Human Computer Interaction

Keywords: Artificial Intelligence, Computer Science, Computer Vision, Deep Learning, Convolutional Neural Network, Leverage (Statistics), Depth Perception, RGB Color Model, Perception, Biology, Neuroscience
Authors
Shanbao Qiao,Naixue Xiong,Yongbin Gao,Zhijun Fang,Wenjun Yu,Juan Zhang,Xiaoyan Jiang
Source
Journal: ACM Transactions on Multimedia Computing, Communications, and Applications [Association for Computing Machinery]
Volume/Issue: 20 (2): 1-21 · Cited by: 4
Identifier
DOI:10.1145/3588571
Abstract

3D perception of depth and ego-motion is of vital importance in intelligent-agent and Human-Computer Interaction (HCI) tasks, such as robotics and autonomous driving. There are different kinds of sensors that can directly obtain 3D depth information. However, the commonly used Lidar sensor is expensive, and the effective range of RGB-D cameras is limited. In the field of computer vision, researchers have done a lot of work on 3D perception. While traditional geometric algorithms require many hand-crafted features for depth estimation, Deep Learning methods have achieved great success in this field. In this work, we propose a novel self-supervised method based on a Vision Transformer (ViT) with Convolutional Neural Network (CNN) architecture, referred to as ViT-Depth. The image reconstruction losses computed from the estimated depth and motion between adjacent frames are treated as the supervision signal to establish a self-supervised learning pipeline. This is an effective solution for tasks that need accurate and low-cost 3D perception, such as autonomous driving, robotic navigation, and 3D reconstruction. Our method leverages both the ability of CNNs to extract deep features and the ability of Transformers to capture global contextual information. In addition, we propose a cross-frame loss that constrains photometric error and scale consistency across multiple frames, which makes the training process more stable and improves performance. Extensive experimental results on autonomous driving datasets demonstrate that the proposed approach is competitive with state-of-the-art depth and motion estimation methods.
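The abstract describes a self-supervised pipeline in which the supervision signal is a photometric reconstruction loss: pixels from a source frame are warped into the target view using the estimated depth and relative camera pose, and the reconstruction error drives training. The sketch below illustrates that general mechanism (standard in self-supervised depth estimation, not the paper's exact implementation): the pinhole back-project/transform/re-project step and a mean absolute photometric error with nearest-neighbour sampling. The function names and the simplified intrinsics `(f, cx, cy)` are illustrative assumptions.

```python
def warp_pixel(u, v, d, K, R, t):
    """Project a target pixel (u, v) with depth d into the source frame.

    K is a simplified pinhole intrinsics tuple (f, cx, cy); R (3x3 list)
    and t (3-vector) give the relative camera pose target -> source.
    """
    f, cx, cy = K
    # back-project the pixel to 3D camera coordinates using its depth
    p = ((u - cx) * d / f, (v - cy) * d / f, d)
    # rigid-body transform into the source camera frame
    q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
    # re-project with the pinhole model
    return f * q[0] / q[2] + cx, f * q[1] / q[2] + cy

def photometric_loss(target, source, depth, K, R, t):
    """Mean absolute photometric error between the target frame and the
    source frame warped into the target view (nearest-neighbour sampling).
    In a full pipeline, depth and (R, t) come from the trained networks,
    so minimizing this loss supervises both without ground-truth labels."""
    h, w = len(target), len(target[0])
    total, count = 0.0, 0
    for v in range(h):
        for u in range(w):
            su, sv = warp_pixel(u, v, depth[v][u], K, R, t)
            iu, iv = round(su), round(sv)
            if 0 <= iu < w and 0 <= iv < h:  # ignore pixels warped off-image
                total += abs(target[v][u] - source[iv][iu])
                count += 1
    return total / max(count, 1)
```

With an identity pose the warp maps each pixel to itself, so the loss between a frame and itself is zero; any depth or pose error increases the reconstruction error, which is what makes it usable as a training signal. Production implementations replace the loops with differentiable bilinear sampling (e.g. a spatial-transformer-style grid sample) so gradients flow back to the depth and pose networks.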
