计算机科学
杠杆(统计)
可扩展性
可控性
人工智能
绘图
机器学习
数据挖掘
数据库
计算机图形学(图像)
应用数学
数学
作者
Axel Sauer,Katja Schwarz,Andreas Geiger
标识
DOI:10.1145/3528233.3530738
摘要
Computer graphics has experienced a recent surge of data-centric approaches for photorealistic and controllable content creation. StyleGAN in particular sets new standards for generative modeling regarding image quality and controllability. However, StyleGAN’s performance severely degrades on large unstructured datasets such as ImageNet. StyleGAN was designed for controllability; hence, prior works suspect its restrictive design to be unsuitable for diverse datasets. In contrast, we find the main limiting factor to be the current training strategy. Following the recently introduced Projected GAN paradigm, we leverage powerful neural network priors and a progressive growing strategy to successfully train the latest StyleGAN3 generator on ImageNet. Our final model, StyleGAN-XL, sets a new state-of-the-art on large-scale image synthesis and is the first to generate images at a resolution of 10242 at such a dataset scale. We demonstrate that this model can invert and edit images beyond the narrow domain of portraits or specific object classes. Code, models, and supplementary videos can be found at https://sites.google.com/view/stylegan-xl/ .
科研通智能强力驱动
Strongly Powered by AbleSci AI