VideoBooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-driven video generation has witnessed rapid progress. However, merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …

ConsistI2V: Enhancing visual consistency for image-to-video generation

W Ren, H Yang, G Zhang, C Wei, X Du… - arXiv preprint arXiv …, 2024 - arxiv.org
Image-to-video (I2V) generation aims to use the initial frame (alongside a text prompt) to
create a video sequence. A grand challenge in I2V generation is to maintain visual …

AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks

M Ku, C Wei, W Ren, H Yang, W Chen - Transactions on Machine …, 2024 - openreview.net
In the dynamic field of digital content creation using generative models, state-of-the-art video
editing models still do not offer the level of quality and control that users desire. Previous …

FreeLong: Training-free long video generation with SpectralBlend temporal attention

Y Lu, Y Liang, L Zhu, Y Yang - arXiv preprint arXiv:2407.19918, 2024 - arxiv.org
Video diffusion models have made substantial progress in various video generation
applications. However, training models for long video generation tasks requires significant …

Deep diffusion image prior for efficient OOD adaptation in 3D inverse problems

H Chung, JC Ye - European Conference on Computer Vision, 2024 - Springer
Recent inverse problem solvers that leverage generative diffusion priors have garnered
significant attention due to their exceptional quality. However, adaptation of the prior is …

Cinemo: Consistent and controllable image animation with motion diffusion models

X Ma, Y Wang, G Jia, X Chen, YF Li, C Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have achieved great progress in image animation due to their powerful
generative capabilities. However, maintaining spatio-temporal consistency with detailed …

SG-I2V: Self-guided trajectory control in image-to-video generation

K Namekata, S Bahmani, Z Wu, Y Kant… - arXiv preprint arXiv …, 2024 - arxiv.org
Methods for image-to-video generation have achieved impressive, photo-realistic quality.
However, adjusting specific elements in generated videos, such as object motion or camera …

FlexiEdit: Frequency-aware latent refinement for enhanced non-rigid editing

G Koo, S Yoon, JW Hong, CD Yoo - European Conference on Computer …, 2024 - Springer
Current image editing methods primarily utilize DDIM Inversion, employing a two-branch
diffusion approach to preserve the attributes and layout of the original image. However …

Identifying and solving conditional image leakage in image-to-video diffusion model

M Zhao, H Zhu, C Xiang, K Zheng, C Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have made substantial progress in image-to-video generation.
However, in this paper, we find that these models tend to generate videos with less motion …

Animate3D: Animating any 3D model with multi-view video diffusion

Y Jiang, C Yu, C Cao, F Wang, W Hu, J Gao - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in 4D generation mainly focus on generating 4D content by distilling
pre-trained text- or single-view image-conditioned models. It is inconvenient for them to take …