DiT: Self-supervised Pre-training for Document Image Transformer