Bit-level scaling laws for mean zero-shot performance across four datasets for 125M to 66B parameter OPT models. Zero-shot performance increases steadily for a fixed number of model bits as we reduce the quantization precision from 16 to 4 bits. At 3-bits, this relationship reverses, making 4-bit precision optimal.