State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

A survey of multimodal-guided image editing with text-to-image diffusion models

X Shuai, H Ding, X Ma, R Tu, YG Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Image editing aims to modify a given synthetic or real image to meet specific user requirements.
It has been widely studied in recent years as a promising and challenging field of …

SparseCtrl: Adding sparse controls to text-to-video diffusion models

Y Guo, C Yang, A Rao, M Agrawala, D Lin… - European Conference on …, 2024 - Springer
The development of text-to-video (T2V) generation, i.e., generating videos from a given text prompt, has
advanced significantly in recent years. However, relying solely on text prompts often …

Style aligned image generation via shared attention

A Hertz, A Voynov, S Fruchter… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large-scale Text-to-Image (T2I) models have rapidly gained prominence across
creative fields, generating visually compelling outputs from textual prompts. However …

Break-A-Scene: Extracting multiple concepts from a single image

O Avrahami, K Aberman, O Fried, D Cohen-Or… - SIGGRAPH Asia 2023 …, 2023 - dl.acm.org
Text-to-image model personalization aims to introduce a user-provided concept to the
model, allowing its synthesis in diverse contexts. However, current methods primarily focus …

VideoBooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-driven video generation has witnessed rapid progress. However, text prompts alone
are not enough to depict the desired subject appearance that accurately aligns with users' …

Subject-Diffusion: Open domain personalized text-to-image generation without test-time fine-tuning

J Ma, J Liang, C Chen, H Lu - ACM SIGGRAPH 2024 Conference …, 2024 - dl.acm.org
Recent progress in personalized image generation using diffusion models has been
significant. However, development in the area of open-domain, test-time fine-tuning-free …

A neural space-time representation for text-to-image personalization

Y Alaluf, E Richardson, G Metzer… - ACM Transactions on …, 2023 - dl.acm.org
A key aspect of text-to-image personalization methods is the manner in which the target
concept is represented within the generative process. This choice greatly affects the visual …

FRESCO: Spatial-temporal correspondence for zero-shot video translation

S Yang, Y Zhou, Z Liu, CC Loy - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The remarkable efficacy of text-to-image diffusion models has motivated extensive
exploration of their potential applications in the video domain. Zero-shot methods seek to extend …

Customizable image synthesis with multiple subjects

Z Liu, Y Zhang, Y Shen, K Zheng… - Advances in neural …, 2023 - proceedings.neurips.cc
Synthesizing images with user-specified subjects has received growing attention due to its
practical applications. Despite recent success in single-subject customization, existing …