LION: Latent Point Diffusion Models for 3D Shape Generation
Denoising diffusion models (DDMs) have shown promising results in 3D point
cloud synthesis. To advance 3D DDMs and make them useful for digital artists,
we require (i) high generation quality, (ii) flexibility for manipulation and
applications such as conditional synthesis and shape interpolation, and (iii)
the ability to output smooth surfaces or meshes. To this end, we introduce the
hierarchical Latent Point Diffusion Model (LION) for 3D shape generation. LION
is set up as a variational autoencoder (VAE) with a hierarchical latent space
that combines a global shape latent representation with a point-structured
latent space. For generation, we train two hierarchical DDMs in these latent
spaces. The hierarchical VAE approach boosts performance compared to DDMs that
operate on point clouds directly, while the point-structured latents are still
ideally suited for DDM-based modeling. Experimentally, LION achieves
state-of-the-art generation performance on multiple ShapeNet benchmarks.
Furthermore, our VAE framework allows us to easily use LION for different
relevant tasks: LION excels at multimodal shape denoising and voxel-conditioned
synthesis, and it can be adapted for text- and image-driven 3D generation. We
also demonstrate shape autoencoding and latent shape interpolation, and we
augment LION with modern surface reconstruction techniques to generate smooth
3D meshes. We hope that LION provides a powerful tool for artists working with
3D shapes due to its high-quality generation, flexibility, and surface
reconstruction. Project page and code: this https URL
Authors
Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis