The Kullback-Leibler Divergence

A Bayesian Characterization of Relative Entropy

We give a new characterization of relative entropy, also known as the Kullback-Leibler divergence. Our proof is independent of all earlier characterizations, but inspired by the work of Petz. We use a number of interesting categories related to probability theory. In particular, we consider a category FinStat where an object is a finite set equipped with a probability distribution, while a morphism is a measure-preserving function together with a measure-preserving right inverse. We show that any convex linear, lower semicontinuous functor from FinStat to the additive monoid [0,∞] which vanishes when the hypothesis is optimal must be a scalar multiple of relative entropy.
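For concreteness, the relative entropy being characterized is the standard one: given two probability distributions $q$ and $p$ on the same finite set, it can be written as

```latex
% Relative entropy (Kullback-Leibler divergence) of q relative to p
% on a finite set with n elements; standard definition, with the
% conventions 0 ln(0/p_i) = 0 and q_i ln(q_i/0) = +infinity for q_i > 0.
S(q \,\|\, p) \;=\; \sum_{i=1}^{n} q_i \ln\!\frac{q_i}{p_i}
\;\in\; [0, \infty]
```

Note that $S(q \,\|\, p)$ can be infinite, which is why the codomain is the monoid $[0,\infty]$ rather than $[0,\infty)$; lower semicontinuity is the natural regularity condition compatible with this.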