Defensive unlearning with adversarial training for robust concept erasure in diffusion models

Y Zhang, X Chen, J Jia, Y Zhang… - Advances in …, 2025 - proceedings.neurips.cc
Diffusion models (DMs) have achieved remarkable success in text-to-image generation, but
they also pose safety risks, such as the potential generation of harmful content and copyright …

Reliable and efficient concept erasure of text-to-image diffusion models

C Gong, K Chen, Z Wei, J Chen, YG Jiang - European Conference on …, 2024 - Springer
Text-to-image models encounter safety issues, including concerns related to copyright and
Not-Safe-For-Work (NSFW) content. Although several methods have been proposed for …

SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models

X Li, Y Yang, J Deng, C Yan, Y Chen, X Ji… - Proceedings of the 2024 …, 2024 - dl.acm.org
Text-to-image (T2I) models, such as Stable Diffusion, have exhibited remarkable
performance in generating high-quality images from text descriptions in recent years …

SAFREE: Training-free and adaptive guard for safe text-to-image and video generation

J Yoon, S Yu, V Patil, H Yao, M Bansal - arXiv preprint arXiv:2410.12761, 2024 - arxiv.org
Recent advances in diffusion models have significantly enhanced their ability to generate
high-quality images and videos, but they have also increased the risk of producing unsafe …

VBench++: Comprehensive and versatile benchmark suite for video generative models

Z Huang, F Zhang, X Xu, Y He, J Yu, Z Dong… - arXiv preprint arXiv …, 2024 - arxiv.org
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …

Direct unlearning optimization for robust and safe text-to-image models

YH Park, S Yun, JH Kim, J Kim, G Jang, Y Jeong… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in text-to-image (T2I) models have greatly benefited from large-scale
datasets, but they also pose significant risks due to the potential generation of unsafe …

RT-Attack: Jailbreaking text-to-image models via random token

S Gao, X Jia, Y Huang, R Duan, J Gu, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, Text-to-Image (T2I) models have achieved remarkable success in image
generation and editing, yet these models still have many potential issues, particularly in …

Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts

H Gao, T Pang, C Du, T Hu, Z Deng, M Lin - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid progress of diffusion-based content generation, significant efforts are being
made to unlearn harmful or copyrighted concepts from pretrained diffusion models (DMs) to …

Replication in visual diffusion models: A survey and outlook

W Wang, Y Sun, Z Yang, Z Hu, Z Tan… - arXiv preprint arXiv …, 2024 - arxiv.org
Visual diffusion models have revolutionized the field of creative AI, producing high-quality
and diverse content. However, they inevitably memorize training images or videos …

Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey

X Liu, X Cui, P Li, Z Li, H Huang, S Xia, M Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid evolution of multimodal foundation models has led to significant advancements in
cross-modal understanding and generation across diverse modalities, including text …