Controllability and Robustness in Large Language Models
Large Language Models with Controllable Working Memory
We undertake a joint study of controllability and robustness in the context of large language models (LLMs).
We demonstrate that state-of-the-art T5 and PaLM models (both pretrained and finetuned) can exhibit poor controllability and robustness, neither of which improves with increasing model size.
Our comprehensive evaluation showcases the utility of knowledge aware fine-tuning (KAFT) across model architectures and sizes.
Authors
Daliang Li, Ankit Singh Rawat, Manzil Zaheer, Xin Wang, Michal Lukasik, Andreas Veit, Felix Yu, Sanjiv Kumar
It is well known that large language models memorize large amounts of factual knowledge in their parameters, which could potentially be outdated or incorrect.
Even for moderate-size models, it is prohibitively expensive to retrain every time an update happens or a mistake is uncovered in the model's parametric world knowledge.
This dilemma is especially sharp in the case of factual (world) knowledge, which plays an important role in realizing the impressive performance of large language models pretrained on large-scale datasets.
However, as models scale ever larger, they become more expensive to train, making it unrealistic to frequently change model parameters.
In real-world applications, it is often necessary to adjust the model's behavior.
Even if resources are ample, it is non-trivial to ensure that retraining only modifies the target knowledge without affecting other knowledge or skills present in the model.
Furthermore, one piece of factual knowledge might have a large number of different mentions, or it may be implicitly inferred from multiple sentences in the pretraining corpus, making it extremely difficult even to prepare an edited version of the training set.
Result
Large language models may become less controllable by context as the model size increases, even as they acquire more world knowledge and become otherwise stronger.
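One way to make this claim concrete is a simple controllability probe: present the model with a context whose answer contradicts its parametric knowledge and check whether the generated answer follows the context. The sketch below is a minimal illustration, assuming a generic `generate_fn(prompt) -> str` text-generation callable and a toy counterfactual example; it is not the paper's exact evaluation protocol.

```python
from typing import Callable, Dict, List

def controllability_probe(
    generate_fn: Callable[[str], str],
    examples: List[Dict[str, str]],
) -> float:
    """Fraction of examples where the model's answer follows the
    (counterfactual) context rather than its parametric knowledge.

    Each example needs: 'context' (contradicts world knowledge),
    'question', and 'context_answer' (the answer supported by the context).
    """
    followed = 0
    for ex in examples:
        prompt = (
            f"Context: {ex['context']}\n"
            f"Question: {ex['question']}\n"
            "Answer:"
        )
        answer = generate_fn(prompt).strip().lower()
        if ex["context_answer"].lower() in answer:
            followed += 1
    return followed / max(len(examples), 1)


# Toy usage with a hypothetical counterfactual example.
examples = [{
    "context": "The Eiffel Tower is located in Rome.",   # contradicts world knowledge
    "question": "Where is the Eiffel Tower located?",
    "context_answer": "Rome",                            # a controllable model should say this
}]
# score = controllability_probe(my_generate_fn, examples)
```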
Therefore, new methods are needed to improve the controllability of large language models.
The gain from KAFT originates from counterfactual augmentation, through which the model explicitly learns to prioritize the context when it conflicts with the model's pretrained factual knowledge.
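The following sketch illustrates the flavor of such counterfactual augmentation, assuming a simple QA format in which the original answer entity in the context is swapped for an alternative entity and the training target is updated to match the edited context; the exact recipe in the paper may differ.

```python
import random
from typing import Dict, List

def make_counterfactual(example: Dict[str, str], candidate_answers: List[str]) -> Dict[str, str]:
    """Create a counterfactual training example by replacing the original
    answer entity in the context with a randomly chosen alternative, and
    setting the target to the new entity so the model learns to prefer
    the context over its parametric knowledge when the two conflict."""
    alternatives = [a for a in candidate_answers if a != example["answer"]]
    new_answer = random.choice(alternatives)
    return {
        "question": example["question"],
        "context": example["context"].replace(example["answer"], new_answer),
        "answer": new_answer,  # target now follows the edited context
    }


# Toy usage with a hypothetical example.
original = {
    "question": "Where is the Eiffel Tower located?",
    "context": "The Eiffel Tower is a landmark located in Paris.",
    "answer": "Paris",
}
counterfactual = make_counterfactual(original, ["Paris", "Rome", "Berlin"])
# counterfactual["context"] now mentions Rome or Berlin, and the answer matches the edit.
```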