Scalable Vector Graphics
Vector graphics
Raster graphics
Computer science
Drawing
Artificial intelligence
Pixel
Fidelity
Computer vision
Computer graphics (images)
World Wide Web
Telecommunications
Authors
Ajay Jain,Amber Xie,Pieter Abbeel
Identifier
DOI:10.1109/cvpr52729.2023.00190
Abstract
Diffusion models have shown impressive results in text-to-image synthesis. Using massive datasets of captioned images, diffusion models learn to generate raster images of highly diverse objects and scenes. However, designers frequently use vector representations of images like Scalable Vector Graphics (SVGs) for digital icons or art. Vector graphics can be scaled to any size, and are compact. We show that a text-conditioned diffusion model trained on pixel representations of images can be used to generate SVG-exportable vector graphics. We do so without access to large datasets of captioned SVGs. By optimizing a differentiable vector graphics rasterizer, our method, VectorFusion, distills abstract semantic knowledge out of a pretrained diffusion model. Inspired by recent text-to-3D work, we learn an SVG consistent with a caption using Score Distillation Sampling. To accelerate generation and improve fidelity, VectorFusion also initializes from an image sample. Experiments show greater quality than prior work, and demonstrate a range of styles including pixel art and sketches.
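The core idea in the abstract, Score Distillation Sampling (SDS) through a differentiable rasterizer, can be illustrated with a toy sketch. Everything below is a stand-in, not the paper's implementation: `render` replaces a real differentiable vector rasterizer (the paper uses one for SVG paths), and `predict_noise` replaces a pretrained text-conditioned diffusion model, here mimicked by a denoiser that prefers a fixed `target` image. The sketch only demonstrates the SDS update, w(t) * (eps_hat - eps) backpropagated through the renderer into the shape parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8                                        # image / parameter dimension (toy)
A = np.eye(D) + 0.1 * rng.standard_normal((D, D))  # fixed linear "rasterizer"
target = rng.standard_normal(D)              # image the stand-in "diffusion model" prefers
alphas = np.linspace(0.9999, 0.01, 1000)     # cumulative noise schedule (alpha_bar)

def render(theta):
    # Stand-in rasterizer: a fixed linear map from shape parameters to a
    # flat image. A real vector rasterizer is nonlinear but differentiable.
    return A @ theta

def predict_noise(x_t, t):
    # Stand-in denoiser: recovers the noise under the assumption that the
    # clean image is `target`, mimicking a caption-conditioned model.
    return (x_t - np.sqrt(alphas[t]) * target) / np.sqrt(1 - alphas[t])

theta = np.zeros(D)                          # "SVG" parameters being optimized
lr = 0.02
for step in range(1000):
    t = int(rng.integers(20, 980))           # random diffusion timestep
    x = render(theta)
    eps = rng.standard_normal(D)             # fresh Gaussian noise
    x_t = np.sqrt(alphas[t]) * x + np.sqrt(1 - alphas[t]) * eps
    # SDS gradient: w(t) * (eps_hat - eps), pushed through the renderer's
    # Jacobian (just A^T for this linear toy renderer).
    w = 1 - alphas[t]
    grad = A.T @ (w * (predict_noise(x_t, t) - eps))
    theta -= lr * grad

# The rendered "SVG" drifts toward the image the denoiser prefers.
final_error = float(np.linalg.norm(render(theta) - target))
```

The point of the sketch is that no SVG dataset is needed: gradients flow from the (frozen) denoiser's noise prediction back through the rasterizer into vector parameters, which is what lets VectorFusion distill a pixel-space diffusion model into SVG-exportable graphics.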