A/B/n Testing with Control in the Presence of Subpopulations

Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter Koolen

Motivated by A/B/n testing applications, we consider a finite set of
distributions (called \emph{arms}), one of which is treated as a
\emph{control}. We assume that the population is stratified into homogeneous
subpopulations. At every time step, a subpopulation is sampled and an arm is
chosen: the resulting observation is an independent draw from the arm
conditioned on the subpopulation. The quality of each arm is assessed through a
weighted combination of its subpopulation means. We propose a strategy for
sequentially choosing one arm per time step so as to identify as quickly as
possible which arms, if any, have a higher weighted expectation than the control.
This strategy is shown to be asymptotically optimal in the following sense: if
$\tau_\delta$ is the first time at which the strategy can guarantee that its
output is correct with probability at least $1-\delta$, then
$\mathbb{E}[\tau_\delta]$ grows linearly with $\log(1/\delta)$ at the exact
optimal rate. This rate is identified in the paper in three different settings:
(1) when the experimenter does not observe the subpopulation information, (2)
when the subpopulation of each sample is observed but not chosen, and (3) when
the experimenter can select the subpopulation from which each response is
sampled. We illustrate the efficiency of the proposed strategy with numerical
simulations on synthetic data and on real data collected from an A/B/n experiment.
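
The quantities described above can be written out as follows. This is only a sketch in notation that is assumed here and not taken from the paper: arms are indexed by $a \in \{0, \dots, K\}$ with $a = 0$ the control, subpopulations by $j \in \{1, \dots, J\}$, $w_j$ denotes the weight of subpopulation $j$, $\mu_{a,j}$ the mean of arm $a$ in subpopulation $j$, and $T^\star$ a setting-dependent constant standing for the optimal rate. The last line is one common way to formalize "grows linearly with $\log(1/\delta)$ at the exact optimal rate"; the precise statement in the paper may differ.

```latex
% Assumed notation (not from the source): w_j, \mu_{a,j}, \mathcal{S}^\star, T^\star.
\begin{align*}
  % Quality of arm a: weighted combination of its subpopulation means.
  \mu_a &= \sum_{j=1}^{J} w_j \, \mu_{a,j},
        \qquad w_j \ge 0, \quad \sum_{j=1}^{J} w_j = 1, \\
  % Correct answer: the (possibly empty) set of arms that beat the control.
  \mathcal{S}^\star &= \bigl\{ a \in \{1, \dots, K\} : \mu_a > \mu_0 \bigr\}, \\
  % Asymptotic optimality: expected stopping time scales as T^\star \log(1/\delta).
  \lim_{\delta \to 0} \frac{\mathbb{E}[\tau_\delta]}{\log(1/\delta)} &= T^\star .
\end{align*}
```

Under this reading, the three settings listed above would each come with their own value of $T^\star$, since the information and control available to the experimenter differ between them.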