We describe a novel lossy compression approach called DiffC, which is based on
unconditional diffusion generative models. Unlike modern compression schemes
which rely on transform coding and quantization to restrict the transmitted
information, DiffC relies on the efficient communication of pixels corrupted by
Gaussian noise. We implement a proof of concept and find that it works
surprisingly well despite the lack of an encoder transform, outperforming the
state-of-the-art generative compression method HiFiC on ImageNet 64×64. DiffC
only uses a single model to encode and denoise corrupted pixels at arbitrary
bitrates. The approach further provides support for progressive coding, that
is, decoding from partial bit streams. We perform a rate-distortion analysis to
gain a deeper understanding of its performance, providing analytical results
for multivariate Gaussian data as well as initial results for general
distributions. Furthermore, we show that a flow-based reconstruction achieves a
3 dB gain over ancestral sampling at high bitrates.
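The following is a minimal, self-contained sketch of the core idea described above, not the paper's implementation: the sender corrupts the signal with Gaussian noise following the diffusion forward process, and the receiver reconstructs it by running the reverse (ancestral) diffusion chain. For illustration only, the reverse-channel-coding step that would actually turn the noisy sample into bits is omitted (the noisy sample is handed to the decoder directly), and the learned diffusion model is replaced by the closed-form optimal denoiser for a toy Gaussian source; all function and variable names here are hypothetical.

```python
# Conceptual sketch of DiffC-style compression on a toy Gaussian source.
# Not the authors' code: the bitstream step (reverse channel coding) is skipped
# and the learned denoiser is replaced by a closed-form Gaussian denoiser.
import numpy as np

rng = np.random.default_rng(0)

# Variance-preserving forward process: x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps
T = 200
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def forward_noise(x0, t):
    """'Encoder': corrupt x0 with Gaussian noise up to timestep t."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

def denoiser(xt, t, sigma0=1.0):
    """Optimal E[x_0 | x_t] for a zero-mean Gaussian source with variance sigma0^2.
    A trained diffusion model would play this role for natural images."""
    a = np.sqrt(alpha_bars[t])
    s2 = 1.0 - alpha_bars[t]
    return (a * sigma0**2) / (a**2 * sigma0**2 + s2) * xt

def ancestral_decode(xt, t_start):
    """'Decoder': run the reverse diffusion chain from the received noisy pixels."""
    x = xt.copy()
    for t in range(t_start, 0, -1):
        x0_hat = denoiser(x, t)
        # DDPM posterior q(x_{t-1} | x_t, x0_hat)
        coef0 = np.sqrt(alpha_bars[t - 1]) * betas[t] / (1.0 - alpha_bars[t])
        coeft = np.sqrt(alphas[t]) * (1.0 - alpha_bars[t - 1]) / (1.0 - alpha_bars[t])
        mean = coef0 * x0_hat + coeft * x
        var = betas[t] * (1.0 - alpha_bars[t - 1]) / (1.0 - alpha_bars[t])
        x = mean + np.sqrt(var) * rng.standard_normal(x.shape)
    return denoiser(x, 0)

x0 = rng.standard_normal(16)            # toy "pixels"
t = 100                                 # smaller t -> less noise -> higher bitrate
xt = forward_noise(x0, t)               # in DiffC, xt would be communicated with
x_hat = ancestral_decode(xt, t)         # reverse channel coding, not sent raw
print("MSE:", np.mean((x_hat - x0) ** 2))
```

In this toy setting, choosing a smaller timestep t corresponds to transmitting a less corrupted sample, i.e., spending more bits for a better reconstruction; decoding from progressively larger t values gives coarser reconstructions, loosely mirroring the progressive-coding property mentioned above.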
Authors
Lucas Theis, Tim Salimans, Matthew D. Hoffman, Fabian Mentzer