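Fidelity
Consistency (knowledge base)
Computer science
Artificial intelligence
Set (abstract data type)
Conditional random field
Diffusion
Image (mathematics)
Measure (data warehouse)
Conditioning
Scale (ratio)
Object (grammar)
View synthesis
Algorithm
Field (mathematics)
Computer vision
Mathematics
Data mining
Statistics
Telecommunications
Rendering (computer graphics)
Physics
Quantum mechanics
Pure mathematics
Thermodynamics
Programming language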
Authors
Daniel Watson,William Chan,Ricardo Martin-Brualla,Jonathan C. Ho,Andrea Tagliasacchi,Mohammad Norouzi
Source
Journal: Cornell University - arXiv
Date: 2022-01-01
Citations: 55
Identifiers
DOI: 10.48550/arxiv.2210.04628
Abstract
We present 3DiM, a diffusion model for 3D novel view synthesis, which is able to translate a single input view into consistent and sharp completions across many views. The core component of 3DiM is a pose-conditional image-to-image diffusion model, which takes a source view and its pose as inputs, and generates a novel view for a target pose as output. 3DiM can generate multiple views that are 3D consistent using a novel technique called stochastic conditioning. The output views are generated autoregressively, and during the generation of each novel view, one selects a random conditioning view from the set of available views at each denoising step. We demonstrate that stochastic conditioning significantly improves the 3D consistency of a naive sampler for an image-to-image diffusion model, which involves conditioning on a single fixed view. We compare 3DiM to prior work on the SRN ShapeNet dataset, demonstrating that 3DiM's generated completions from a single view achieve much higher fidelity, while being approximately 3D consistent. We also introduce a new evaluation methodology, 3D consistency scoring, to measure the 3D consistency of a generated object by training a neural field on the model's output views. 3DiM is geometry free, does not rely on hyper-networks or test-time optimization for novel view synthesis, and allows a single model to easily scale to a large number of scenes.
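The sampling loop described in the abstract (autoregressive generation, with a randomly chosen conditioning view at every denoising step) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_denoise_step`, `sample_views`, the flat-list "images", the scalar poses, and the step count are all hypothetical stand-ins chosen so the example runs without a trained model.

```python
import random


def toy_denoise_step(x, cond_view, cond_pose, target_pose, t):
    """Hypothetical stand-in for one reverse-diffusion step.

    A real pose-conditional image-to-image model would go here; this toy
    version only nudges the sample toward the conditioning view and
    ignores the poses.
    """
    w = 0.5 / (t + 1)
    return [(1 - w) * xi + w * ci for xi, ci in zip(x, cond_view)]


def sample_views(source_view, source_pose, target_poses,
                 num_steps=8, denoise_step=toy_denoise_step, rng=None):
    """Sketch of autoregressive sampling with stochastic conditioning."""
    rng = rng or random.Random(0)
    # The set of available views starts with the single input view.
    views = [(source_view, source_pose)]
    picks = []  # which available view was used at each denoising step
    for target_pose in target_poses:
        # Each novel view starts from Gaussian noise.
        x = [rng.gauss(0.0, 1.0) for _ in source_view]
        step_picks = []
        for t in reversed(range(num_steps)):
            # Stochastic conditioning: draw a fresh conditioning view
            # from the available set at every denoising step.
            k = rng.randrange(len(views))
            cond_view, cond_pose = views[k]
            x = denoise_step(x, cond_view, cond_pose, target_pose, t)
            step_picks.append(k)
        # The finished view joins the set used for later views.
        views.append((x, target_pose))
        picks.append(step_picks)
    return views, picks


views, picks = sample_views([0.0, 1.0, 2.0], 0.0, [30.0, 60.0, 90.0])
```

In this sketch the first novel view can only condition on the input view, while later views draw from a growing set. The naive sampler the abstract compares against would instead fix `k` to a single view for all denoising steps.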