8-bit Optimizers via Block-wise Quantization