Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
The common practice for training commonsense models has gone from-human-to-corpus-to-machine: humans author commonsense knowledge graphs in order to train commonsense models.
In this work, we investigate an alternative, from-machine-to-corpus-to-machine: general language models author these commonsense knowledge graphs to train commonsense models.
As with prior art in knowledge distillation (Hinton et al., 2015), our approach uses larger models to teach smaller models.
A key difference is that we distill knowledge symbolically, as text, in addition to the neural model.
We also distill only one aspect, the commonsense of a general language model teacher, allowing the student to be a different type, a commonsense model.
Our study leads to a new framework, symbolic knowledge distillation.
We apply this to the ATOMIC resource, and share our new symbolic knowledge graph and commonsense models.
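As a rough illustration of the from-machine-to-corpus-to-machine idea, the sketch below prompts a general language model to author candidate commonsense inferences as text, which would then form the symbolic corpus used to train a student commonsense model. It uses a Hugging Face text-generation pipeline with GPT-2 as a stand-in teacher (the paper distills from GPT-3); the prompt format, relation names, and generation settings here are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch: a general LM authors commonsense triples as text.
# Assumes the `transformers` library; GPT-2 stands in for the large teacher.
from transformers import pipeline

teacher = pipeline("text-generation", model="gpt2")  # stand-in teacher LM

# Few-shot prompt with ATOMIC-style (event, relation, inference) examples
# (illustrative examples, not taken from the released corpus).
FEW_SHOT = (
    "Event: PersonX pays PersonY a compliment. xWant: to be nice.\n"
    "Event: PersonX spills the coffee. xEffect: gets embarrassed.\n"
)

def generate_triples(event: str, relation: str, n: int = 5):
    """Ask the teacher LM to author candidate inferences for one event/relation."""
    prompt = f"{FEW_SHOT}Event: {event}. {relation}:"
    outputs = teacher(
        prompt,
        max_new_tokens=12,
        num_return_sequences=n,
        do_sample=True,
        pad_token_id=teacher.tokenizer.eos_token_id,
    )
    triples = []
    for out in outputs:
        # Keep only the newly generated continuation, trimmed to one line.
        tail = out["generated_text"][len(prompt):].split("\n")[0].strip()
        if tail:
            triples.append((event, relation, tail))
    return triples

# The collected triples (after filtering with a critic model, in the paper)
# would then be used to fine-tune a smaller student commonsense model.
corpus = generate_triples("PersonX forgets PersonY's birthday", "xEffect")
print(corpus)
```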
Authors
Peter West, Chandra Bhagavatula, Jack Hessel, Jena D. Hwang, Liwei Jiang, Ronan Le Bras, Ximing Lu, Sean Welleck, Yejin Choi