HyperCLOVA: A Korean-specific version of GPT-3 with prompt optimization
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
In this paper, we address some remaining issues less reported by the GPT-3 paper, such as a non-English language model (LM), the performances of different sized models, and the effect of recently introduced prompt optimization on in-context learning.
To address these issues, we introduce HyperCLOVA, a Korean variant of 82B GPT-3 trained on a Korean-centric corpus of 560B tokens.
Enhanced by our Korean-specific tokenization, our training configuration shows state-of-the-art in-context zero-shot and few-shot learning performances on various downstream tasks in Korean.
Also, we show the performance benefits of prompt-based learning and demonstrate how it can be integrated into the prompt engineering pipeline.
We then discuss the possibility of materializing the No Code AI paradigm by providing prompt engineering capabilities to non-experts of ML through HyperCLOVA Studio, an interactive prompt engineering interface.
Finally, we demonstrate the potential of our methods with three successful in-house applications.
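As a rough illustration of the in-context learning setting described in the abstract, the sketch below contrasts zero-shot and few-shot prompt construction for a Korean sentiment task. The `generate` function is a hypothetical stand-in for a large-LM completion endpoint, not the actual HyperCLOVA Studio API, and the prompt templates are illustrative assumptions.

```python
# Minimal sketch of zero-shot vs. few-shot in-context prompting.
# `generate` is a placeholder for a call to an LM completion endpoint
# (e.g. a HyperCLOVA-style service); it is NOT the paper's actual API.

def generate(prompt: str, max_tokens: int = 16) -> str:
    """Placeholder for an LM completion call; returns a dummy string here."""
    return "<model completion>"

def zero_shot(query: str) -> str:
    # Zero-shot: the task is conveyed only by the instruction itself.
    prompt = f"다음 문장의 감정을 '긍정' 또는 '부정'으로 분류하세요.\n문장: {query}\n감정:"
    return generate(prompt)

def few_shot(query: str, examples: list[tuple[str, str]]) -> str:
    # Few-shot: a handful of labeled examples are prepended to the prompt,
    # so the frozen model infers the task from context alone.
    demos = "\n".join(f"문장: {x}\n감정: {y}" for x, y in examples)
    prompt = f"{demos}\n문장: {query}\n감정:"
    return generate(prompt)

if __name__ == "__main__":
    demos = [("정말 재미있었어요.", "긍정"), ("배송이 너무 늦어요.", "부정")]
    print(zero_shot("서비스가 친절했습니다."))
    print(few_shot("서비스가 친절했습니다.", demos))
```

In this setting only the prompt changes between tasks; the model weights stay fixed, which is what makes interactive prompt engineering tools like HyperCLOVA Studio practical for non-experts.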
Authors
Boseop Kim, HyoungSeok Kim, Sang-Woo Lee, Gichang Lee, Donghyun Kwak, Dong Hyeon Jeon, Sunghyun Park, Sungju Kim, Seonhoon Kim, Dongpil Seo, Heungsub Lee, Minyoung Jeong, Sungjae Lee, Minsub Kim, Suk Hyun Ko, Seokhun Kim, Taeyong Park, Jinuk Kim, Soyoung Kang, Na-Hyeon Ryu