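Autoencoder
Computer science
Perspective (graphical)
Noise (video)
Diffusion
Generative grammar
Generative model
Function (biology)
Computation
Scalability
Artificial intelligence
Applied mathematics
Markov chain
Artificial neural network
Mathematical optimization
Theoretical computer science
Algorithm
Machine learning
Mathematics
Image (mathematics)
Physics
Database
Evolutionary biology
Biology
Thermodynamics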
                        
                    
                    
Source

Journal: Cornell University - arXiv
Date: 2022-01-01
Citations: 95
                                
         
        
    
            
Identifier

DOI: 10.48550/arxiv.2208.11970
                                    
                                
                                 
         
        
                
Abstract
            
            Diffusion models have shown incredible capabilities as generative models; indeed, they power the current state-of-the-art models on text-conditioned image generation such as Imagen and DALL-E 2. In this work we review, demystify, and unify the understanding of diffusion models across both variational and score-based perspectives. We first derive Variational Diffusion Models (VDM) as a special case of a Markovian Hierarchical Variational Autoencoder, where three key assumptions enable tractable computation and scalable optimization of the ELBO. We then prove that optimizing a VDM boils down to learning a neural network to predict one of three potential objectives: the original source input from any arbitrary noisification of it, the original source noise from any arbitrarily noisified input, or the score function of a noisified input at any arbitrary noise level. We then dive deeper into what it means to learn the score function, and connect the variational perspective of a diffusion model explicitly with the Score-based Generative Modeling perspective through Tweedie's Formula. Lastly, we cover how to learn a conditional distribution using diffusion models via guidance.
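
The equivalence between the three prediction targets named in the abstract (source input, source noise, score function) can be illustrated with a minimal sketch. It assumes the common Gaussian forward process x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps; the symbol `alpha_bar` and all function names below are illustrative assumptions and do not appear in the abstract. Scalars are used for simplicity.

```python
import math


def noisify(x0, eps, alpha_bar):
    """Assumed forward process: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    return math.sqrt(alpha_bar) * x0 + math.sqrt(1.0 - alpha_bar) * eps


def eps_to_score(eps, alpha_bar):
    """Convert a noise value into the corresponding score: -eps / sqrt(1 - alpha_bar)."""
    return -eps / math.sqrt(1.0 - alpha_bar)


def score_to_x0(x_t, score, alpha_bar):
    """Tweedie-style estimate of the source input from a score value."""
    return (x_t + (1.0 - alpha_bar) * score) / math.sqrt(alpha_bar)


def x0_to_eps(x_t, x0, alpha_bar):
    """Recover the noise from the noisified input and the source input."""
    return (x_t - math.sqrt(alpha_bar) * x0) / math.sqrt(1.0 - alpha_bar)


# Round trip: source input -> noisified input -> score -> source input.
x0, eps, alpha_bar = 2.0, -0.5, 0.64
x_t = noisify(x0, eps, alpha_bar)
score = eps_to_score(eps, alpha_bar)
x0_recovered = score_to_x0(x_t, score, alpha_bar)
eps_recovered = x0_to_eps(x_t, x0_recovered, alpha_bar)
```

Under these assumptions, knowing any one of the three quantities (together with the noisified input and the noise level) determines the other two, which is why a network trained on one target can be reinterpreted as predicting the others.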
         
            
 
                 
                
                    