Escaping the Big Data Paradigm with Compact Transformers