Prompt-based Multitask Benchmarking for Large Language Models - 42Papers