BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
We present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers.
BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total).
We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning.
Authors
Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase
We present the BigScience Large Open-science Open-access Multilingual language model (BLOOM), a 176 billion parameter language model trained on 46 natural languages and 13 programming languages, developed and released by a collaboration of hundreds of researchers.
Our overall aim is not only to publicly release a large-scale multilingual language model with performance comparable to recently developed systems, but also to document the coordinated process that went into its development.
The purpose of this paper is to provide a high-level overview of these design steps while referencing the individual reports we produced over the course of developing BLOOM.
Results
We present the release of BLOOM, a 176B-parameter open-access multilingual language model.
It was created by BigScience, a collaboration of hundreds of researchers, and was trained on the French government-funded Jean Zay supercomputer for 3.5 months.
In this paper, we chronicled its development, from the creation of its training dataset ROOTS to the design of its architecture and tokenizer.
We also discussed evaluation results for BLOOM and other large language models, finding that it achieves competitive performance that improves further after multitask prompted finetuning.
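Because BLOOM is released open-access, its checkpoints can be loaded with standard tooling. The snippet below is a minimal illustrative sketch, assuming the Hugging Face transformers library and the publicly hosted bigscience/bloom model identifiers; it is not part of the paper's evaluation setup, and the smaller released variant bloom-560m is used here because the full 176B model requires multi-GPU or offloaded inference.

# Minimal sketch: loading an open-access BLOOM checkpoint for text generation.
# Assumes the Hugging Face `transformers` library; "bigscience/bloom-560m" is a
# smaller released variant, swap in "bigscience/bloom" with sufficient hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("BLOOM is a multilingual language model that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))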