Improving Worst-Group Accuracy with Large Pre-Training Datasets