- Academic Search

State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library

The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

Save Cite Cited by 91 Related articles All 12 versions Free GPT-4

Save Cite Cited by 228 Related articles All 6 versions Free GPT-4 View as HTML

Preserve your own correlation: A noise prior for video diffusion models

S Ge, S Nah, G Liu, T Poon, A Tao… - Proceedings of the …, 2023 - openaccess.thecvf.com

Despite tremendous progress in generating high-quality images using diffusion models,
synthesizing a sequence of animated frames that are both photorealistic and temporally …

Dynamicrafter: Animating open-domain images with video diffusion priors

J **ng, M **a, Y Zhang, H Chen, W Yu, H Liu… - … on Computer Vision, 2024 - Springer

Animating a still image offers an engaging visual experience. Traditional image animation
techniques mainly focus on animating natural scenes with stochastic dynamics (eg clouds …

Save Cite Cited by 159 Related articles All 2 versions Free GPT-4

Save Cite Cited by 217 Related articles All 6 versions Free GPT-4 View as HTML

Pix2video: Video editing using image diffusion

D Ceylan, CHP Huang, NJ Mitra - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Image diffusion models, trained on massive image collections, have emerged as the most
versatile image generator model in terms of quality and diversity. They support inverting real …

Photorealistic video generation with diffusion models

A Gupta, L Yu, K Sohn, X Gu, M Hahn, FF Li… - … on Computer Vision, 2024 - Springer

We present WALT, a diffusion transformer for photorealistic video generation from text
prompts. Our approach has two key design decisions. First, we use a causal encoder to …

Save Cite Cited by 127 Related articles All 3 versions Free GPT-4

Save Cite Cited by 175 Related articles All 5 versions Free GPT-4 View as HTML

Videopoet: A large language model for zero-shot video generation

D Kondratyuk, L Yu, X Gu, J Lezama, J Huang… - arxiv preprint arxiv …, 2023 - arxiv.org

We present VideoPoet, a language model capable of synthesizing high-quality video, with
matching audio, from a large variety of conditioning signals. VideoPoet employs a decoder …

Save Cite Cited by 145 Related articles All 5 versions Free GPT-4 View as HTML

Stablevideo: Text-driven consistency-aware diffusion video editing

W Chai, X Guo, G Wang, Y Lu - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Diffusion-based methods can generate realistic images and videos, but they struggle to edit
existing objects in a video while preserving their geometry over time. This prevents diffusion …

Sparsectrl: Adding sparse controls to text-to-video diffusion models

Y Guo, C Yang, A Rao, M Agrawala, D Lin… - European Conference on …, 2024 - Springer

The development of text-to-video (T2V), ie, generating videos with a given text prompt, has
been significantly advanced in recent years. However, relying solely on text prompts often …

Save Cite Cited by 76 Related articles All 2 versions Free GPT-4