State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

Recent advances in 3d gaussian splatting

T Wu, YJ Yuan, LX Zhang, J Yang, YP Cao… - Computational Visual …, 2024 - Springer
The emergence of 3D Gaussian splatting (3DGS) has greatly accelerated rendering in novel
view synthesis. Unlike neural implicit representations like neural radiance fields (NeRFs) …

Anydoor: Zero-shot object-level image customization

X Chen, L Huang, Y Liu, Y Shen… - Proceedings of the …, 2024 - openaccess.thecvf.com
This work presents AnyDoor, a diffusion-based image generator with the power to teleport
target objects to new scenes at user-specified locations with desired shapes. Instead of …

Magicanimate: Temporally consistent human image animation using diffusion model

Z Xu, J Zhang, JH Liew, H Yan, JW Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper studies the human image animation task, which aims to generate a video of a
certain reference identity following a particular motion sequence. Existing animation works …

Dragdiffusion: Harnessing diffusion models for interactive point-based image editing

Y Shi, C Xue, JH Liew, J Pan, H Yan… - Proceedings of the …, 2024 - openaccess.thecvf.com
Accurate and controllable image editing is a challenging task that has attracted significant
attention recently. Notably, DragGAN, developed by Pan et al. (2023), is an interactive point …

Fastcomposer: Tuning-free multi-subject image generation with localized attention

G Xiao, T Yin, WT Freeman, F Durand… - International Journal of …, 2024 - Springer
Diffusion models excel at text-to-image generation, especially in subject-driven generation
for personalized images. However, existing methods are inefficient due to the subject …

Style aligned image generation via shared attention

A Hertz, A Voynov, S Fruchter… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large-scale Text-to-Image (T2I) models have rapidly gained prominence across
creative fields, generating visually compelling outputs from textual prompts. However …

Storydiffusion: Consistent self-attention for long-range image and video generation

Y Zhou, D Zhou, MM Cheng… - Advances in Neural …, 2025 - proceedings.neurips.cc
For recent diffusion-based generative models, maintaining consistent content across a
series of generated images, especially those containing subjects and complex details …

Mastering text-to-image diffusion: Recaptioning, planning, and generating with multimodal llms

L Yang, Z Yu, C Meng, M Xu, S Ermon… - Forty-first International …, 2024 - openreview.net
Diffusion models have exhibited exceptional performance in text-to-image generation and
editing. However, existing methods often face challenges when handling complex text …

Follow your pose: Pose-guided text-to-video generation using pose-free videos

Y Ma, Y He, X Cun, X Wang, S Chen, X Li… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Generating text-editable and pose-controllable character videos is in high demand for
creating various digital humans. Nevertheless, this task has been restricted by the absence …