Data Selection Based Fine-tuning Pipeline for Image Classification
Improved Fine-tuning by Leveraging Pre-training Data: Theory and Practice
Recent studies have empirically shown that, in some vision tasks, training from scratch achieves final performance no worse than fine-tuning a pre-trained model once the number of training iterations is increased.
In this work, we revisit this phenomenon from the perspective of generalization analysis, which is popular in learning theory.
Our result reveals that the final prediction precision may depend only weakly on the pre-trained model, especially when the number of training iterations is large.
Guided by this theoretical finding, we propose a novel strategy that selects a subset of the pre-training data to help improve generalization on the target task.
Extensive experimental results for image classification tasks on 8 benchmark data sets verify the effectiveness of the proposed data selection-based fine-tuning pipeline.
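The abstract does not state how the subset of pre-training data is chosen. Purely as an illustration of what a selection step in such a pipeline could look like, the sketch below ranks pre-training samples by cosine similarity to the mean target feature and keeps the top k. The similarity criterion, the function name `select_pretraining_subset`, and the assumption that features have already been extracted are all hypothetical and are not taken from the paper.

```python
import numpy as np


def select_pretraining_subset(pretrain_feats, target_feats, k, eps=1e-12):
    """Return indices of the k pre-training samples most similar to the target data.

    Illustrative assumption: similarity is the cosine between each
    pre-training feature vector and the mean target feature vector.
    """
    pretrain_feats = np.asarray(pretrain_feats, dtype=float)
    target_feats = np.asarray(target_feats, dtype=float)

    # L2-normalize rows so that dot products become cosine similarities.
    norms = np.linalg.norm(pretrain_feats, axis=1, keepdims=True)
    p = pretrain_feats / np.maximum(norms, eps)

    # Summarize the target set by its normalized mean feature.
    t = target_feats.mean(axis=0)
    t = t / max(np.linalg.norm(t), eps)

    scores = p @ t
    # Indices of the k highest-scoring samples, best first.
    return np.argsort(-scores)[:k]


if __name__ == "__main__":
    pretrain = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
    target = [[1.0, 0.1], [0.9, 0.0]]
    print(select_pretraining_subset(pretrain, target, k=2))
```

In a pipeline of this shape, the selected samples would then be used together with the target data during fine-tuning; how they are combined is left open here because the abstract does not specify it.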
Authors
Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Antoni Chan, Rong Jin