DiffDreamer: An unsupervised framework for generating long-range novel views
DiffDreamer: Consistent Single-view Perpetual View Generation with Conditional Diffusion Models
We introduce an unsupervised framework capable of synthesizing novel views depicting a long camera trajectory while training solely on internet-collected images of nature scenes.
We demonstrate that image-conditioned diffusion models can effectively perform long-range scene extrapolation while preserving both local and global consistency significantly better than prior GAN-based methods.
Authors
Shengqu Cai, Eric Ryan Chan, Songyou Peng, Mohamad Shahbazi, Anton Obukhov, Luc Van Gool, Gordon Wetzstein
Consistent perpetual view generation is extremely challenging, as it tackles two difficult tasks simultaneously: consistent single-view novel view synthesis (NVS) and long-range extrapolation.
Given a single image and a long camera trajectory, the goal of perpetual view generation is to synthesize a multiview-consistent 3D scene along the camera trajectory.
In other words, we want to teach a machine to hallucinate content when flying into the image while maintaining multiview consistency, thereby extrapolating the scene realistically.
Successfully addressing this task opens up a wide range of potential applications in virtual reality, 3d content creation, synthetic data creation, and 3d viewing platforms.