Concept sliders: LoRA adaptors for precise control in diffusion models

R Gandikota, J Materzyńska, T Zhou, A Torralba… - … on Computer Vision, 2024 - Springer
We present a method to create interpretable concept sliders that enable precise control over
attributes in image generations from diffusion models. Our approach identifies a low-rank …

Self-discovering interpretable diffusion latent directions for responsible text-to-image generation

H Li, C Shen, P Torr, V Tresp… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Diffusion-based models have gained significant popularity for text-to-image generation due
to their exceptional image-generation capabilities. A risk with these models is the potential …

Self-rectifying diffusion sampling with perturbed-attention guidance

D Ahn, H Cho, J Min, W Jang, J Kim, SH Kim… - … on Computer Vision, 2024 - Springer
Recent studies have demonstrated that diffusion models can generate high-quality samples,
but their quality heavily depends on sampling guidance techniques, such as classifier …

Open-vocabulary attention maps with token optimization for semantic segmentation in diffusion models

P Marcos-Manchón, R Alcover-Couso… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion models represent a new paradigm in text-to-image generation. Beyond generating
high-quality images from text prompts, models such as Stable Diffusion have been …

NoiseCLR: A contrastive learning approach for unsupervised discovery of interpretable directions in diffusion models

Y Dalva, P Yanardag - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Generative models have been very popular in recent years for their image generation
capabilities. GAN-based models are highly regarded for their disentangled latent space …

Exploring low-dimensional subspaces in diffusion models for controllable image editing

S Chen, H Zhang, M Guo, Y Lu, P Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, diffusion models have emerged as a powerful class of generative models. Despite
their success, there is still limited understanding of their semantic spaces. This makes it …

Unpacking SDXL Turbo: Interpreting text-to-image models with sparse autoencoders

V Surkov, C Wendler, M Terekhov… - arXiv preprint arXiv …, 2024 - arxiv.org
Sparse autoencoders (SAEs) have become a core ingredient in the reverse engineering of
large-language models (LLMs). For LLMs, they have been shown to decompose …

Diffusion PID: Interpreting Diffusion via Partial Information Decomposition

S Dewan, R Zawar, P Saxena… - Advances in Neural …, 2025 - proceedings.neurips.cc
Text-to-image diffusion models have made significant progress in generating naturalistic
images from textual inputs, and demonstrate the capacity to learn and represent complex …

Global counterfactual directions

B Sobieski, P Biecek - European Conference on Computer Vision, 2024 - Springer
Despite increasing progress in development of methods for generating visual counterfactual
explanations, previous works consider them as an entirely local technique. In this work, we …

HARIVO: Harnessing text-to-image models for video generation

M Kwon, SW Oh, Y Zhou, D Liu, JY Lee, H Cai… - … on Computer Vision, 2024 - Springer
We present a method to create diffusion-based video models from pretrained Text-to-Image
(T2I) models. Recently, AnimateDiff proposed freezing the T2I model while only training …