Large-scale Knowledge Distillation with Elastic Heterogeneous Computing Resources