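Rendering (computer graphics)
Maxima and minima
Computer science
Artificial intelligence
Differentiable function
Computer vision
Representation (politics)
Object (grammar)
Set (abstract data type)
Artificial neural network
Orientation (vector space)
Object detection
Position (finance)
Task (project management)
Pattern recognition (psychology)
Mathematics
Economics
Law
Programming language
Management
Geometry
Mathematical analysis
Politics
Political science
Finance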
Authors
David Griffiths, J. Boehm, Tobias Ritschel
Identifier
DOI:10.1109/3dv53792.2021.00062
Abstract
In this paper we set out to solve the task of 6-DOF 3D object detection from 2D images, where the only supervision is a geometric representation of the objects we aim to find. In doing so, we remove the need for 6-DOF labels (i.e., position, orientation, etc.), allowing our network to be trained on unlabeled images in a self-supervised manner. We achieve this through a neural network that learns an explicit scene parameterization, which is subsequently passed into a differentiable renderer. We analyze why analysis-by-synthesis-like losses for supervising 3D scene structure with differentiable rendering are not practical: the optimization almost always gets stuck in local minima caused by visual ambiguities. This can be overcome by a novel form of training in which an additional network steers the optimization to explore the entire parameter space, i.e., to be curious, and hence to resolve those ambiguities and find workable minima.
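The local-minimum problem described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the paper's method: the "renderer" is a hypothetical 1D function of a single pose angle `theta`, the object is nearly symmetric under a 180-degree flip, and plain multi-start gradient descent stands in for exploration of the parameter space (the paper instead trains an additional network to steer the optimization).

```python
import math

N = 64  # number of "pixels" sampled around a circle (toy assumption)
PHIS = [2 * math.pi * k / N for k in range(N)]

def render(theta):
    """Toy differentiable 'renderer': 1D image of a nearly symmetric object."""
    return [math.cos(2 * (p - theta)) + 0.3 * math.cos(p - theta) for p in PHIS]

def d_render(theta):
    """Derivative of every pixel with respect to the pose angle theta."""
    return [2 * math.sin(2 * (p - theta)) + 0.3 * math.sin(p - theta) for p in PHIS]

TARGET = render(0.0)  # observed image; the true pose is theta = 0

def loss(theta):
    """Analysis-by-synthesis loss: mean squared error between render and target."""
    img = render(theta)
    return sum((a - b) ** 2 for a, b in zip(img, TARGET)) / N

def grad(theta):
    """Gradient of the loss, propagated through the renderer by the chain rule."""
    img, d_img = render(theta), d_render(theta)
    return sum(2 * (a - b) * d for a, b, d in zip(img, TARGET, d_img)) / N

def descend(theta, lr=0.1, steps=200):
    """Plain gradient descent on the pose angle."""
    for _ in range(steps):
        theta -= lr * grad(theta)
    return theta

# A poor initialization converges to the flipped pose (theta near pi),
# a local minimum with loss about 0.18 rather than 0.
stuck = descend(2.5)

# A good initialization recovers the true pose.
good = descend(0.5)

# Crude stand-in for exploration: try several starts and keep the best one.
starts = [-math.pi + 2 * math.pi * k / 8 for k in range(8)]
best = min((descend(s) for s in starts), key=loss)

print(f"stuck: theta={stuck:.3f}, loss={loss(stuck):.3f}")
print(f"good:  theta={good:.3f}, loss={loss(good):.3f}")
print(f"best of restarts: theta={best:.3f}, loss={loss(best):.3f}")
```

The flipped pose looks almost identical to the true one, so the gradient alone gives no way out of it; some mechanism that covers the rest of the parameter space is needed to reach the correct minimum.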