A comprehensive survey of continual learning: Theory, method and application

L Wang, X Zhang, H Su, J Zhu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
To cope with real-world dynamics, an intelligent system needs to incrementally acquire,
update, accumulate, and exploit knowledge throughout its lifetime. This ability, known as …

Continual learning of large language models: A comprehensive survey

H Shi, Z Xu, H Wang, W Qin, W Wang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent success of large language models (LLMs) trained on static, pre-collected,
general datasets has sparked numerous research directions and applications. One such …

Memorization without overfitting: Analyzing the training dynamics of large language models

K Tirumala, A Markosyan… - Advances in …, 2022 - proceedings.neurips.cc
Despite their wide adoption, the underlying training and memorization dynamics of very
large language models are not well understood. We empirically study exact memorization in …
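
Note: the snippet does not define "exact memorization"; a common operationalization in this line of work is whether the model, given a prefix drawn from its training data, greedily predicts the true next token. A minimal sketch under that assumption, written against a Hugging Face-style causal LM interface (the function name, data placeholders, and context length are illustrative, not the paper's code):

    import torch

    def exact_memorization_rate(model, tokenizer, train_texts, context_len=64):
        """Fraction of training contexts whose next token the model predicts exactly.
        Assumes a Hugging Face-style causal LM; this is an illustration, not the
        paper's measurement code."""
        hits, total = 0, 0
        model.eval()
        for text in train_texts:
            ids = tokenizer(text, return_tensors="pt").input_ids[0]
            if ids.shape[0] <= context_len:
                continue  # skip examples shorter than the probing context
            context = ids[:context_len].unsqueeze(0)
            target = ids[context_len].item()
            with torch.no_grad():
                logits = model(context).logits[0, -1]  # next-token distribution
            hits += int(logits.argmax().item() == target)
            total += 1
        return hits / max(total, 1)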

Simple and scalable strategies to continually pre-train large language models

A Ibrahim, B Thérien, K Gupta, ML Richter… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are routinely pre-trained on billions of tokens, only to start
the process over again once new data becomes available. A much more efficient solution is …
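
Note: the snippet cuts off before naming the strategies; the paper's recipe centers on re-warming and re-decaying the learning rate when a new corpus arrives and replaying a small fraction of earlier data. A rough sketch of that idea (the schedule shape, constants, and replay ratio below are illustrative, not the paper's hyperparameters):

    import math, random

    def lr_at(step, phase_start, phase_len, max_lr=3e-4, min_lr=3e-5, warmup=1000):
        """Cosine schedule that is re-warmed from min_lr at the start of each data phase."""
        t = step - phase_start
        if t < warmup:  # linear re-warming at the start of the new phase
            return min_lr + (max_lr - min_lr) * t / warmup
        progress = (t - warmup) / max(phase_len - warmup, 1)
        return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))

    def sample_example(new_data, old_data, replay_frac=0.05):
        """Draw mostly from the new corpus, with a small replay fraction from the old one."""
        source = old_data if random.random() < replay_frac else new_data
        return random.choice(source)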

Architecture matters in continual learning

SI Mirzadeh, A Chaudhry, D Yin, T Nguyen… - arXiv preprint arXiv …, 2022 - arxiv.org
A large body of research in continual learning is devoted to overcoming the catastrophic
forgetting of neural networks by designing new algorithms that are robust to the distribution …

The ideal continual learner: An agent that never forgets

L Peng, P Giampouras, R Vidal - … Conference on Machine …, 2023 - proceedings.mlr.press
The goal of continual learning is to find a model that solves multiple learning tasks which are
presented sequentially to the learner. A key challenge in this setting is that the learner may …
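
Note: the snippet breaks off at the key phrase (the learner may "catastrophically forget" earlier tasks). A compressed paraphrase of the setup, in my own notation rather than the authors':

    \mathcal{W}_t = \arg\min_{w}\, \ell_t(w), \qquad
    w_T \in \arg\min_{w}\, \sum_{t=1}^{T} \ell_t(w);
    \quad \text{if } \bigcap_{t=1}^{T} \mathcal{W}_t \neq \emptyset,
    \text{ then } w_T \text{ lies in that intersection and incurs zero forgetting on every earlier task.}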

Learning and forgetting unsafe examples in large language models

J Zhao, Z Deng, D Madras, J Zou, M Ren - arXiv preprint arXiv:2312.12736, 2023 - arxiv.org
As the number of large language models (LLMs) released to the public grows, there is a
pressing need to understand the safety implications associated with these models learning …

How catastrophic can catastrophic forgetting be in linear regression?

I Evron, E Moroshko, R Ward… - … on Learning Theory, 2022 - proceedings.mlr.press
To better understand catastrophic forgetting, we study fitting an overparameterized linear
model to a sequence of tasks with different input distributions. We analyze how much the …
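
Note: to make the setting concrete, here is a sketch of a standard formulation (my notation; the paper's exact definitions may differ). Task t supplies an underdetermined system X_t w = y_t, and running gradient descent to convergence from the previous weights projects onto that task's solution set; one natural forgetting measure is the residual loss of the final model on earlier tasks:

    w_t = \arg\min_{w \,:\, X_t w = y_t} \|w - w_{t-1}\|_2, \qquad
    F_T = \frac{1}{T-1} \sum_{t=1}^{T-1} \|X_t w_T - y_t\|_2^2 .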

Membership inference attacks and defenses in classification models

J Li, N Li, B Ribeiro - Proceedings of the Eleventh ACM Conference on …, 2021 - dl.acm.org
We study the membership inference (MI) attack against classifiers, where the attacker's goal
is to determine whether a data instance was used for training the classifier. Through …
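
Note: as an illustration of the attack goal described above, the generic loss-thresholding baseline (Yeom et al.) predicts membership when the model's loss on an example is low; this is a standard baseline, not the specific attack or defense studied in this paper:

    import numpy as np

    def mi_attack_loss_threshold(per_example_losses, threshold):
        """Predict 'member' (1) when the model's loss on an example falls below the threshold,
        since training examples tend to have lower loss than unseen ones."""
        return (np.asarray(per_example_losses) < threshold).astype(int)

    # e.g. with the threshold set to the attacker's estimate of the average training loss:
    preds = mi_attack_loss_threshold([0.02, 1.3, 0.4], threshold=0.5)  # -> [1, 0, 1]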

CoSCL: Cooperation of small continual learners is stronger than a big one

L Wang, X Zhang, Q Li, J Zhu, Y Zhong - European Conference on …, 2022 - Springer
Continual learning requires incremental compatibility with a sequence of tasks. However,
the design of model architecture remains an open question: In general, learning all tasks …
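
Note: the architectural idea conveyed by the title is to split a fixed parameter budget across several narrow learners and ensemble their predictions, rather than training one wide network. A rough sketch of the ensembling step only (layer widths are illustrative, and the cooperation/regularization term the paper adds is omitted):

    import torch
    import torch.nn as nn

    class SmallLearnerEnsemble(nn.Module):
        """K narrow learners sharing one parameter budget; their logits are averaged."""
        def __init__(self, in_dim, num_classes, k=4, width=64):
            super().__init__()
            self.learners = nn.ModuleList(
                nn.Sequential(nn.Linear(in_dim, width), nn.ReLU(), nn.Linear(width, num_classes))
                for _ in range(k)
            )

        def forward(self, x):
            return torch.stack([m(x) for m in self.learners]).mean(dim=0)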