MagicMix: A Simple Yet Effective Solution for Spatio-Temporal Concept Synthesis
MagicMix: Semantic Mixing with Diffusion Models
We present a novel method for semantic mixing, aiming at synthesizing a novel concept while preserving the spatial layout and geometry.
Our method does not require any spatial mask or re-training, yet is able to synthesize novel objects with high fidelity.
Motivated by the progressive generation property of text-conditioned diffusion models, our method first obtains a coarse layout (either by corrupting an image or by denoising from pure Gaussian noise given a text prompt), followed by injection of a conditional prompt for semantic mixing.
We further devise two simple strategies to provide better control and flexibility over the synthesized content.
We present our results over diverse downstream applications, including semantic style transfer, novel object synthesis, breed mixing, and concept removal, demonstrating the flexibility of our method.
Authors
Jun Hao Liew, Hanshu Yan, Daquan Zhou, Jiashi Feng
In this work, we study a new problem termed semantic mixing, whose objective is to blend two different semantics (e.g., corgi and coffee machine) in a semantic manner to create a new concept (e.g., a corgi-alike coffee machine) while remaining photo-realistic.
Such a problem is challenging, since even a human user might not know what the result is supposed to look like.
To address this, we present a new approach termed MagicMix, which is built upon existing text-conditioned image diffusion-based generative models.
Our approach is extremely simple, requiring neither re-training nor user-provided masks.
Result
In this work, we present a novel task called semantic mixing, whose objective is to mix two different semantics to synthesize a new, unseen concept.
Our method exploits the properties of diffusion-based generative models by injecting new concepts during the denoising process.
Our approach does not require any spatial masks or re-training, while preserving the layout and geometry.
Given this, MagicMix supports several downstream applications, including semantic style transfer, novel object synthesis, breed mixing, and concept removal.
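The two-stage idea described above (obtain a coarse layout by corrupting an image, then denoise while conditioning on the new concept) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the linear noise schedule, the deterministic DDIM-style update, and the names `make_alpha_bars`, `corrupt`, `semantic_mix`, `make_toy_predictor`, and `start_step` are all assumptions made for illustration. In particular, the toy noise predictor is a hypothetical stand-in for a text-conditioned diffusion model; it always points toward a fixed "concept" signal, so it shows the control flow only and does not reproduce layout preservation.

```python
import math
import random

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def corrupt(pixels, t, alpha_bars, rng):
    """Forward-diffuse pixels to step t; a larger t retains only coarser structure."""
    a = alpha_bars[t]
    return [math.sqrt(a) * p + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for p in pixels]

def semantic_mix(layout_pixels, predict_noise, start_step, alpha_bars, rng):
    """Corrupt the layout image, then denoise it under the conditioned predictor."""
    x = corrupt(layout_pixels, start_step, alpha_bars, rng)
    for t in range(start_step, 0, -1):
        a_t, a_prev = alpha_bars[t], alpha_bars[t - 1]
        eps = predict_noise(x, t)  # conditioned on the new concept
        # Estimate the clean signal implied by the predicted noise.
        x0_hat = [(xi - math.sqrt(1.0 - a_t) * ei) / math.sqrt(a_t)
                  for xi, ei in zip(x, eps)]
        # Deterministic DDIM-style move to the previous noise level.
        x = [math.sqrt(a_prev) * x0i + math.sqrt(1.0 - a_prev) * ei
             for x0i, ei in zip(x0_hat, eps)]
    return x

def make_toy_predictor(concept_pixels, alpha_bars):
    """Hypothetical placeholder for a text-conditioned noise predictor."""
    def predict_noise(x_t, t):
        a = alpha_bars[t]
        # Noise that would explain x_t if the clean signal were the concept.
        return [(xi - math.sqrt(a) * ci) / math.sqrt(1.0 - a)
                for xi, ci in zip(x_t, concept_pixels)]
    return predict_noise

rng = random.Random(0)
alpha_bars = make_alpha_bars()
layout = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]  # stand-in for a layout image
concept = [0.5] * 8                                 # stand-in for the prompt concept
mixed = semantic_mix(layout, make_toy_predictor(concept, alpha_bars),
                     600, alpha_bars, rng)
print(len(mixed))
```

In this sketch, `start_step` controls how much of the layout image survives the corruption: a smaller value keeps more of the original structure, a larger one leaves more freedom to the conditioned denoiser. With a real model, `predict_noise` would be replaced by the network's prompt-conditioned noise estimate.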