VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models
Diffusion models have shown impressive results in text-to-image synthesis.
However, designers frequently work with vector representations of images, such as Scalable Vector Graphics (SVGs), for digital icons and art.
We show that a text-conditioned diffusion model trained on pixel representations of images can be used to generate SVG-exportable vector graphics without access to large datasets of captioned SVGs.
By optimizing a differentiable vector graphics rasterizer, our method, VectorFusion, distills abstract semantic knowledge out of a pretrained diffusion model.
Experiments show higher quality than prior work and demonstrate a range of styles, including pixel art and sketches.
Vector graphics are the de facto format for exporting graphic designs, since they can be rendered at arbitrarily high resolution on user devices yet are stored and transmitted at a compact size, often only tens of kilobytes.
However, designing vector graphics is difficult, requiring knowledge of professional design tools.
In this work, we provide a method for generating high-quality abstract vector graphics from text captions.
We start by evaluating a two-phase text-to-image and image-to-vector baseline: generating a raster image with a pretrained diffusion model, then vectorizing it, as sketched below.
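As a rough illustration of this baseline, the following sketch assumes the open-source diffusers library for the text-to-image phase; vectorize is a hypothetical placeholder for any off-the-shelf raster-to-SVG tracer, not a specific tool or API:

```python
# Two-phase baseline: text -> raster image -> traced SVG.
# Assumes the `diffusers` library; `vectorize` is a hypothetical
# stand-in for any automatic raster-to-vector tracing step.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

caption = "a minimalist flat icon of a rocket ship"
raster = pipe(caption).images[0]   # phase 1: sample a raster image from the caption
svg = vectorize(raster)            # phase 2: trace the raster into SVG paths (placeholder)
```

As discussed next, the quality and caption coherence of such traced outputs motivate bringing the diffusion model into the optimization loop itself.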
To improve the quality of the SVG and its coherence with the caption, we incorporate the pretrained text-to-image diffusion model in an optimization loop.
Our approach, VectorFusion, combines a differentiable vector graphics renderer with the recently proposed Score Distillation Sampling (SDS) loss to iteratively refine shape parameters.
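Concretely, SDS treats the rendered image x = R(θ) as a sample to be pulled toward the diffusion model's distribution: the shape parameters θ receive the gradient ∇θ L_SDS = E_{t,ε}[ w(t) (ε̂(x_t; y, t) − ε) ∂x/∂θ ], where ε̂ is the model's noise prediction for caption y and ε is the injected noise. The sketch below illustrates one such update step in PyTorch under stated assumptions: render is a differentiable rasterizer (e.g., in the spirit of diffvg), eps_model is the frozen diffusion model's noise predictor, and the weighting w(t) = 1 − ᾱ_t is one common choice; these names are illustrative, not the authors' exact implementation.

```python
import torch

def sds_step(theta, render, eps_model, text_emb, alphas_cumprod, optimizer):
    # Rasterize the current shape parameters into an image, differentiably.
    x = render(theta)
    # Sample a random diffusion timestep and forward-diffuse the render.
    t = torch.randint(50, 950, (1,), device=x.device)
    eps = torch.randn_like(x)
    a_t = alphas_cumprod[t].view(-1, 1, 1, 1)
    x_t = a_t.sqrt() * x + (1.0 - a_t).sqrt() * eps
    # Query the frozen diffusion model; no gradient flows through its U-Net.
    with torch.no_grad():
        eps_hat = eps_model(x_t, t, text_emb)
    # SDS gradient: weighted residual between predicted and injected noise,
    # injected directly as the gradient of x (the U-Net Jacobian is skipped).
    w = 1.0 - a_t
    grad = w * (eps_hat - eps)
    optimizer.zero_grad()
    x.backward(gradient=grad)
    optimizer.step()  # updates theta, the SVG shape parameters
```

Repeating this step over many iterations refines the shapes while every intermediate state remains a valid, exportable SVG, since optimization acts directly on the vector parameters rather than on pixels.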