Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing

P Liu, W Yuan, J Fu, Z Jiang, H Hayashi… - ACM Computing …, 2023 - dl.acm.org
This article surveys and organizes research works in a new paradigm in natural language
processing, which we dub “prompt-based learning.” Unlike traditional supervised learning …

Dissociating language and thought in large language models

K Mahowald, AA Ivanova, IA Blank, N Kanwisher… - Trends in Cognitive …, 2024 - cell.com
Large language models (LLMs) have come closest among all models to date to mastering
human language, yet opinions about their linguistic and cognitive capabilities remain split …

The Flan collection: Designing data and methods for effective instruction tuning

S Longpre, L Hou, T Vu, A Webson… - International …, 2023 - proceedings.mlr.press
We study the design decisions of publicly available instruction tuning methods, by
reproducing and breaking down the development of Flan 2022 (Chung et al., 2022) …

Finetuned language models are zero-shot learners

J Wei, M Bosma, VY Zhao, K Guu, AW Yu… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper explores a simple method for improving the zero-shot learning abilities of
language models. We show that instruction tuning--finetuning language models on a …

Deep bidirectional language-knowledge graph pretraining

M Yasunaga, A Bosselut, H Ren… - Advances in …, 2022 - proceedings.neurips.cc
Pretraining a language model (LM) on text has been shown to help various downstream
NLP tasks. Recent works show that a knowledge graph (KG) can complement text data …

Multitask prompted training enables zero-shot task generalization

V Sanh, A Webson, C Raffel, SH Bach… - arXiv preprint arXiv …, 2021 - arxiv.org
Large language models have recently been shown to attain reasonable zero-shot
generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that …

Cross-task generalization via natural language crowdsourcing instructions

S Mishra, D Khashabi, C Baral, H Hajishirzi - arXiv preprint arXiv …, 2021 - arxiv.org
Humans (e.g., crowdworkers) have a remarkable ability in solving different tasks, by simply
reading textual instructions that define them and looking at a few examples. Despite the …

MetaICL: Learning to learn in context

S Min, M Lewis, L Zettlemoyer, H Hajishirzi - arXiv preprint arXiv …, 2021 - arxiv.org
We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training
framework for few-shot learning where a pretrained language model is tuned to do in …

TIES-Merging: Resolving interference when merging models

P Yadav, D Tam, L Choshen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Transfer learning–i.e., further fine-tuning a pre-trained model on a downstream task–can
confer significant advantages, including improved downstream performance, faster …

AdapterFusion: Non-destructive task composition for transfer learning

J Pfeiffer, A Kamath, A Rücklé, K Cho… - arXiv preprint arXiv …, 2020 - arxiv.org
Sequential fine-tuning and multi-task learning are methods aiming to incorporate knowledge
from multiple tasks; however, they suffer from catastrophic forgetting and difficulties in …