Understanding Diffusion Models: A Unified Perspective

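Keywords: Autoencoder; Computer science; Perspective (graphical); Noise (video); Diffusion; Generative grammar; Generative model; Function (biology); Computation; Scalability; Artificial intelligence; Applied mathematics; Markov chain; Artificial neural network; Mathematical optimization; Theoretical computer science; Algorithm; Machine learning; Mathematics; Image (mathematics); Physics; Database; Evolutionary biology; Biology; Thermodynamics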

Author

Calvin Luo

Source

Venue: Cornell University - arXiv; Date: 2022-01-01; Citations: 69

Links

arxiv.org, datacite.org, doi.org

Identifier

DOI: 10.48550/arxiv.2208.11970

Abstract

Diffusion models have shown incredible capabilities as generative models; indeed, they power the current state-of-the-art models on text-conditioned image generation such as Imagen and DALL-E 2. In this work we review, demystify, and unify the understanding of diffusion models across both variational and score-based perspectives. We first derive Variational Diffusion Models (VDM) as a special case of a Markovian Hierarchical Variational Autoencoder, where three key assumptions enable tractable computation and scalable optimization of the ELBO. We then prove that optimizing a VDM boils down to learning a neural network to predict one of three potential objectives: the original source input from any arbitrary noisification of it, the original source noise from any arbitrarily noisified input, or the score function of a noisified input at any arbitrary noise level. We then dive deeper into what it means to learn the score function, and connect the variational perspective of a diffusion model explicitly with the Score-based Generative Modeling perspective through Tweedie's Formula. Lastly, we cover how to learn a conditional distribution using diffusion models via guidance.
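The three prediction targets named in the abstract can be related by a short worked derivation. The sketch below is not taken from this record; it assumes the usual Gaussian forward process, and the symbols $\bar\alpha_t$ (cumulative signal coefficient at step $t$), $\epsilon_0$ (source noise) and $\hat{x}_0$ (posterior-mean estimate of the source input) are notation introduced here for illustration.

```latex
% Assumed notation (not stated in the abstract):
%   \bar\alpha_t : cumulative signal coefficient at noise level t
%   \epsilon_0   : standard Gaussian source noise
%   \hat{x}_0    : estimate of the source input given x_t
\begin{align}
% Noisified input as a function of source input and source noise
x_t &= \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon_0,
      \qquad \epsilon_0 \sim \mathcal{N}(0, I) \\
% Rearranged: predicting the noise determines the source input
x_0 &= \frac{x_t - \sqrt{1-\bar\alpha_t}\,\epsilon_0}{\sqrt{\bar\alpha_t}} \\
% Tweedie's formula applied to x_t \mid x_0 \sim \mathcal{N}(\sqrt{\bar\alpha_t}\,x_0, (1-\bar\alpha_t) I)
\hat{x}_0 &= \frac{x_t + (1-\bar\alpha_t)\,\nabla_{x_t}\log p(x_t)}{\sqrt{\bar\alpha_t}} \\
% Equating the two expressions for the source input
\nabla_{x_t}\log p(x_t) &= -\frac{\epsilon_0}{\sqrt{1-\bar\alpha_t}}
\end{align}
```

Read loosely, the last line says the score points in the direction opposite to the source noise, scaled by the noise level. That is one way to see why predicting the source input, the source noise, or the score can be treated as interchangeable objectives under these assumptions.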
