Sharpness-aware Quantization for Deep Neural Networks
In this paper, we devise a sharpness-aware quantization (SAQ) method to train quantized models, leading to better generalization performance.
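To make the idea concrete, below is a minimal sketch of one SAQ-style training step in PyTorch, assuming a model whose layers apply uniform fake quantization with a straight-through estimator in their forward pass; the helper names (`fake_quantize`, `saq_step`) and the perturbation radius `rho` are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F


def fake_quantize(w, num_bits=4):
    # Uniform fake quantization with a straight-through estimator (STE):
    # the forward pass uses quantized weights, while the backward pass treats
    # quantization as the identity so gradients flow to the latent weights.
    # A quantized layer is assumed to call this on its weights in forward().
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    w_q = torch.round(w / scale).clamp(-qmax - 1, qmax) * scale
    return w + (w_q - w).detach()


def saq_step(model, x, y, optimizer, rho=0.05):
    # One sharpness-aware step for a quantization-aware model: perturb the
    # weights towards the locally worst case within an L2 ball of radius rho,
    # then update using the gradient taken at that perturbed point.
    loss = F.cross_entropy(model(x), y)
    loss.backward()

    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(2) for g in grads]), 2)

    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
    optimizer.zero_grad()

    # Gradient at the sharpness-probing (perturbed) point.
    F.cross_entropy(model(x), y).backward()

    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```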
Since each layer contributes differently to the loss value and the loss sharpness of a network, we further devise an effective method that learns a configuration generator to automatically determine the bitwidth configuration for each layer, encouraging lower bitwidths for layers lying in flat regions of the loss landscape and higher bitwidths for those in sharp ones, while simultaneously promoting the flatness of minima to enable more aggressive quantization.
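The generator in the paper is learned; purely as an illustrative stand-in, the sketch below assigns candidate bitwidths from a simple per-layer sharpness proxy (the loss increase under a small random perturbation of that layer's weights), giving fewer bits to flatter layers. The function names, the proxy itself, and `candidate_bits` are assumptions and not the paper's method.

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def layer_sharpness_proxy(model, layer_params, x, y, rho=0.05):
    # Rough sharpness proxy for one layer: the loss increase when only that
    # layer's weights are perturbed by random noise of L2 radius rho.
    base = F.cross_entropy(model(x), y).item()
    noise = []
    for p in layer_params:
        n = torch.randn_like(p)
        n = rho * n / (n.norm() + 1e-12)
        p.add_(n)
        noise.append(n)
    perturbed = F.cross_entropy(model(x), y).item()
    for p, n in zip(layer_params, noise):
        p.sub_(n)
    return perturbed - base


def assign_bitwidths(sharpness_per_layer, candidate_bits=(2, 3, 4, 8)):
    # Heuristic: rank layers by the proxy and give flatter (low-sharpness)
    # layers fewer bits, sharper layers more (the paper learns this mapping).
    order = sorted(range(len(sharpness_per_layer)),
                   key=lambda i: sharpness_per_layer[i])
    bits = [0] * len(sharpness_per_layer)
    for rank, idx in enumerate(order):
        bucket = rank * len(candidate_bits) // len(sharpness_per_layer)
        bits[idx] = candidate_bits[bucket]
    return bits
```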
Extensive experiments on CIFAR-100 and ImageNet show the superior performance of the proposed methods.