Continual learning of large language models: A comprehensive survey
The recent success of large language models (LLMs) trained on static, pre-collected,
general datasets has sparked numerous research directions and applications. One such …
Continual learning of natural language processing tasks: A survey
Continual learning (CL) is a learning paradigm that emulates the human capability of
learning and accumulating knowledge continually without forgetting the previously learned …
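To make the paradigm concrete: experience replay is one of the simplest continual-learning mechanisms, mixing stored examples from earlier tasks into each update so new learning does not overwrite old knowledge. The sketch below is a generic illustration of that idea only, not a method from the survey; all names in it are hypothetical.

# Generic experience-replay sketch for continual learning (illustrative only).
import random
import torch
import torch.nn as nn

model = nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()
buffer: list[tuple[torch.Tensor, torch.Tensor]] = []  # examples kept from past tasks

def train_task(task_data, replay_size: int = 4):
    for x, y in task_data:
        # Mix the new example with a few replayed old ones to limit forgetting.
        batch = [(x, y)] + random.sample(buffer, min(replay_size, len(buffer)))
        opt.zero_grad()
        loss = sum(loss_fn(model(bx), by) for bx, by in batch) / len(batch)
        loss.backward()
        opt.step()
        buffer.append((x, y))  # store the example for future replay

for task in range(3):  # a toy sequence of tasks arriving one after another
    data = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(5)]
    train_task(data)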
ERNIE-ViLG 2.0: Improving text-to-image diffusion model with knowledge-enhanced mixture-of-denoising-experts
Recent progress in diffusion models has revolutionized the popular technology of text-to-
image generation. While existing approaches could produce photorealistic high-resolution …
Modular deep learning
Transfer learning has recently become the dominant paradigm of machine learning. Pre-
trained models fine-tuned for downstream tasks achieve better performance with fewer …
Branch-Train-Merge: Embarrassingly parallel training of expert language models
We present Branch-Train-Merge (BTM), a communication-efficient algorithm for
embarrassingly parallel training of large language models (LLMs). We show it is possible to …
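A rough illustration of the embarrassingly parallel recipe the title describes: branch copies of a seed model, train each copy independently on its own domain shard, then merge the experts (here by simple parameter averaging; BTM also supports ensembling). This is a minimal sketch under those assumptions, not the paper's code, and helper names like train_on_shard are hypothetical.

# Branch-Train-Merge-style sketch: branch, train per shard, merge by averaging.
import copy
import torch
import torch.nn as nn

def branch(seed: nn.Module, n_domains: int) -> list[nn.Module]:
    return [copy.deepcopy(seed) for _ in range(n_domains)]

def train_on_shard(model: nn.Module, shard, epochs: int = 1) -> nn.Module:
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in shard:  # each shard is an iterable of (input, target) pairs
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

def merge(experts: list[nn.Module]) -> nn.Module:
    # Average corresponding parameters across all trained experts.
    merged = copy.deepcopy(experts[0])
    with torch.no_grad():
        for name, p in merged.named_parameters():
            stacked = torch.stack([dict(e.named_parameters())[name] for e in experts])
            p.copy_(stacked.mean(0))
    return merged

seed = nn.Linear(4, 1)
shards = [[(torch.randn(8, 4), torch.randn(8, 1))] for _ in range(3)]  # toy domains
experts = [train_on_shard(m, s) for m, s in zip(branch(seed, 3), shards)]
merged_lm = merge(experts)

Because the experts never communicate during training, each branch can run on separate hardware, which is what makes the scheme "embarrassingly parallel."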
SILO language models: Isolating legal risk in a nonparametric datastore
The legality of training language models (LMs) on copyrighted or otherwise restricted data is
under intense debate. However, as we show, model performance significantly degrades if …
Large language models (LLMs): survey, technical frameworks, and future challenges
P Kumar - Artificial Intelligence Review, 2024 - Springer
Artificial intelligence (AI) has significantly impacted various fields. Large language models
(LLMs) like GPT-4, BARD, PaLM, Megatron-Turing NLG, Jurassic-1 Jumbo, etc. have …
Lifelong language pretraining with distribution-specialized experts
Pretraining on a large-scale corpus has become a standard method to build general
language models (LMs). Adapting a model to new data distributions targeting different …
A survey on mixture of experts
Large language models (LLMs) have garnered unprecedented advancements across
diverse fields, ranging from natural language processing to computer vision and beyond …
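For readers unfamiliar with the technique this survey covers: a mixture-of-experts layer routes each input to a small subset of expert subnetworks chosen by a learned gate, so capacity grows without a proportional compute increase. Below is a minimal generic top-k-routed layer, an assumption-laden sketch rather than any specific architecture from the survey.

# Minimal top-k mixture-of-experts layer (generic illustration).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim: int, n_experts: int = 4, k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.gate = nn.Linear(dim, n_experts)  # router producing per-expert scores
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.gate(x)                       # (batch, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # each token picks k experts
        weights = F.softmax(weights, dim=-1)        # normalize the kept scores
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e  # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = TopKMoE(dim=16)
y = layer(torch.randn(5, 16))  # toy batch of 5 token embeddings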
Lifelong pretraining: Continually adapting language models to emerging corpora
Pretrained language models (PTLMs) are typically learned over a large, static corpus and
further fine-tuned for various downstream tasks. However, when deployed in the real world …