Motion-I2V: Consistent and controllable image-to-video generation with explicit motion modeling

X Shi, Z Huang, FY Wang, W Bian, D Li… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
We introduce Motion-I2V, a novel framework for consistent and controllable text-guided
image-to-video generation (I2V). In contrast to previous methods that directly learn the …

4Real: Towards photorealistic 4D scene generation via video diffusion models

H Yu, C Wang, P Zhuang… - Advances in …, 2025 - proceedings.neurips.cc
Existing dynamic scene generation methods mostly rely on distilling knowledge from pre-
trained 3D generative models, which are typically fine-tuned on synthetic object datasets. As …

MotionBooth: Motion-aware customized text-to-video generation

J Wu, X Li, Y Zeng, J Zhang, Q Zhou, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we present MotionBooth, an innovative framework designed for animating
customized subjects with precise control over both object and camera movements. By …

Animate your motion: Turning still images into dynamic videos

M Li, B Wan, MF Moens, T Tuytelaars - European Conference on Computer …, 2024 - Springer
In recent years, diffusion models have made remarkable strides in text-to-video generation,
sparking a quest for enhanced control over video outputs to more accurately reflect user …

ChatTracker: Enhancing visual tracking performance via chatting with multimodal large language model

Y Sun, F Yu, S Chen, Y Zhang… - Advances in …, 2025 - proceedings.neurips.cc
Visual object tracking aims to locate a targeted object in a video sequence based on an
initial bounding box. Recently, Vision-Language (VL) trackers have proposed to utilize …

CamI2V: Camera-controlled image-to-video diffusion model

G Zheng, T Li, R Jiang, Y Lu, T Wu, X Li - arXiv preprint arXiv:2410.15957, 2024 - arxiv.org
Recently, camera pose, as a user-friendly and physics-related condition, has been
introduced into text-to-video diffusion models for camera control. However, existing methods …

EasyControl: Transfer ControlNet to video diffusion for controllable generation and interpolation

C Wang, J Gu, P Hu, H Zhao, Y Guo, J Han… - arXiv preprint arXiv …, 2024 - arxiv.org
Following the advancements in text-guided image generation technology exemplified by
Stable Diffusion, video generation is gaining increased attention in the academic …

CinePreGen: Camera Controllable Video Previsualization via Engine-powered Diffusion

Y Chen, A Rao, X Jiang, S Xiao, R Ma, Z Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
With advancements in video generative AI models (e.g., Sora), creators are increasingly
using these techniques to enhance video previsualization. However, they face challenges …

3D Object Manipulation in a Single Image using Generative Models

R Zhao, Z Zhang, Z Yang, Y Yang - arXiv preprint arXiv:2501.12935, 2025 - arxiv.org
Object manipulation in images aims not only to edit the object's presentation but also to
endow objects with motion. Previous methods encountered challenges in concurrently handling …

Zero-shot controllable image-to-video animation via motion decomposition

S Yu, JZ Fang, J Zheng, G Sigurdsson… - Proceedings of the …, 2024 - dl.acm.org
In this paper, we introduce a new challenging task called Zero-Shot Controllable Image-to-
Video Animation, where the goal is to animate an image based on motion trajectories …