Rethinking machine unlearning for large language models
We explore machine unlearning (MU) in the domain of large language models (LLMs),
referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence …
To generate or not? safety-driven unlearned diffusion models are still easy to generate unsafe images... for now
The recent advances in diffusion models (DMs) have revolutionized the generation of
realistic and complex images. However, these models also introduce potential safety …
Practical unlearning for large language models
While LLMs have demonstrated impressive performance across various domains and tasks,
their security issues have become increasingly severe. Machine unlearning (MU) has …
Jogging the Memory of Unlearned Models Through Targeted Relearning Attacks
Machine unlearning is a promising approach to mitigate undesirable memorization of
training data in ML models. However, in this work we show that existing approaches for …
Meta-unlearning on diffusion models: Preventing relearning unlearned concepts
With the rapid progress of diffusion-based content generation, significant efforts are being
made to unlearn harmful or copyrighted concepts from pretrained diffusion models (DMs) to …
On effects of steering latent representation for large language model unlearning
Representation Misdirection for Unlearning (RMU), which steers model representation in the
intermediate layer to a target random representation, is an effective method for large …
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
We articulate fundamental mismatches between technical methods for machine unlearning
in Generative AI, and documented aspirations for broader impact that these methods could …
Alternate preference optimization for unlearning factual knowledge in large language models
Machine unlearning aims to efficiently eliminate the influence of specific training data,
known as the forget set, from the model. However, existing unlearning methods for Large …
A Closer Look at Machine Unlearning for Large Language Models
Large language models (LLMs) may memorize sensitive or copyrighted content, raising
privacy and legal concerns. Due to the high cost of retraining from scratch, researchers …
Open Problems in Machine Unlearning for AI Safety
As AI systems become more capable, widely deployed, and increasingly autonomous in
critical areas such as cybersecurity, biological research, and healthcare, ensuring their …