The popularity and widespread use of pruning and quantization are driven by
the severe resource constraints of deploying deep neural networks to
environments with strict latency, memory, and energy requirements. These
techniques achieve high levels of compression with negligible impact on
top-line metrics (top-1 and top-5 accuracy). However, overall accuracy hides
disproportionately high errors on a small subset of examples; we call this
subset Compression Identified Exemplars (CIE). We further establish that
compression amplifies existing algorithmic bias on CIEs. Pruning
disproportionately degrades performance on underrepresented features, which
often coincide with considerations of fairness. Given that CIEs form a
relatively small subset of the data yet account for a disproportionate share
of model error, we propose their use
as a human-in-the-loop auditing tool to surface a tractable subset of the
dataset for further inspection or annotation by a domain expert. We provide
qualitative and quantitative support that CIE surfaces the most challenging
examples in the data distribution for human-in-the-loop auditing.
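The abstract does not spell out how CIEs are identified. As a rough illustration only, the minimal sketch below assumes a CIE is an example whose modal prediction across a population of compressed (e.g., pruned or quantized) models diverges from the modal prediction across non-compressed baselines; the function and variable names (find_cie, baseline_preds, compressed_preds) are hypothetical, not from the paper.

```python
import numpy as np

def find_cie(baseline_preds: np.ndarray, compressed_preds: np.ndarray) -> np.ndarray:
    """Return indices of examples whose modal prediction changes after compression.

    baseline_preds:   (n_baseline_models, n_examples) integer class predictions
    compressed_preds: (n_compressed_models, n_examples) integer class predictions
    """
    def modal(preds: np.ndarray) -> np.ndarray:
        # Per-example majority vote across a population of models.
        return np.array([np.bincount(col).argmax() for col in preds.T])

    baseline_modal = modal(baseline_preds)
    compressed_modal = modal(compressed_preds)
    # Candidate CIEs: examples where the two populations disagree.
    return np.where(baseline_modal != compressed_modal)[0]

# Toy usage with random predictions over 10 classes (illustration only).
rng = np.random.default_rng(0)
baseline = rng.integers(0, 10, size=(5, 1000))
compressed = rng.integers(0, 10, size=(5, 1000))
cie_indices = find_cie(baseline, compressed)
print(f"{len(cie_indices)} candidate CIEs surfaced for human-in-the-loop audit")
```

In an audit workflow along the lines the abstract describes, the surfaced indices would be handed to a domain expert for further inspection or annotation rather than acted on automatically.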
Authors
Sara Hooker, Nyalleng Moorosi, Gregory Clark, Samy Bengio, Emily Denton