MACE: Mass concept erasure in diffusion models

S Lu, Z Wang, L Li, Y Liu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The rapid expansion of large-scale text-to-image diffusion models has raised growing
concerns regarding their potential misuse in creating harmful or misleading content. In this …

Erasing concepts from text-to-image diffusion models with few-shot unlearning

M Fuchi, T Takagi - arXiv preprint arXiv:2405.07288, 2024 - bmva-archive.org.uk
Generating images from text has become easier because of the scaling of diffusion models
and advancements in the field of vision and language. These models are trained using vast …

Decomposing and editing predictions by modeling model computation

H Shah, A Ilyas, A Madry - arXiv preprint arXiv:2404.11534, 2024 - arxiv.org
How does the internal computation of a machine learning model transform inputs into
predictions? In this paper, we introduce a task called component modeling that aims to …

LoopAnimate: Loopable Salient Object Animation

F Wang, P Liu, H Hu, D Meng, J Su, J Xu… - Proceedings of the 6th …, 2024 - dl.acm.org
Research on diffusion model-based video generation has advanced rapidly. However,
limitations in object fidelity and generation length hinder its practical applications …

Pioneering Reliable Assessment in Text-to-Image Knowledge Editing: Leveraging a Fine-Grained Dataset and an Innovative Criterion

H Gu, K Zhou, Y Wang, R Wang, X Wang - arXiv preprint arXiv:2409.17928, 2024 - arxiv.org
During pre-training, the Text-to-Image (T2I) diffusion models encode factual knowledge into
their parameters. These parameterized facts enable realistic image generation, but they may …

Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models

Y Zhang, X Chen, J Jia, Y Zhang, C Fan, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models (DMs) have achieved remarkable success in text-to-image generation, but
they also pose safety risks, such as the potential generation of harmful content and copyright …

Espresso: Robust Concept Filtering in Text-to-Image Models

A Das, V Duddu, R Zhang, N Asokan - arXiv preprint arXiv:2404.19227, 2024 - arxiv.org
Diffusion-based text-to-image (T2I) models generate high-fidelity images for given textual
prompts. They are trained on large datasets scraped from the Internet, potentially containing …

Joint Diffusion models in Continual Learning

P Skierś, K Deja - arXiv preprint arXiv:2411.08224, 2024 - arxiv.org
In this work, we introduce JDCL, a new method for continual learning with generative
rehearsal based on joint diffusion models. Neural networks suffer from catastrophic …

Distorting Embedding Space for Safety: A Defense Mechanism for Adversarially Robust Diffusion Models

J Ahn, H Jung - arXiv preprint arXiv:2501.18877, 2025 - arxiv.org
Text-to-image diffusion models show remarkable generation performance following text
prompts, but risk generating Not Safe For Work (NSFW) contents from unsafe prompts …

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

B Cywiński, K Deja - arXiv preprint arXiv:2501.18052, 2025 - arxiv.org
Recent machine unlearning approaches offer a promising solution for removing unwanted
concepts from diffusion models. However, traditional methods, which largely rely on fine …