Motion-I2V: Consistent and controllable image-to-video generation with explicit motion modeling
We introduce Motion-I2V, a novel framework for consistent and controllable text-guided
image-to-video generation (I2V). In contrast to previous methods that directly learn the …
Consisti2v: Enhancing visual consistency for image-to-video generation
Image-to-video (I2V) generation aims to use the initial frame (alongside a text prompt) to
create a video sequence. A grand challenge in I2V generation is to maintain visual …
Hallo: Hierarchical audio-driven visual synthesis for portrait image animation
The field of portrait image animation, driven by speech audio input, has experienced
significant advancements in the generation of realistic and dynamic portraits. This research …
AniClipart: Clipart animation with text-to-video priors
Clipart, a pre-made graphic art form, offers a convenient and efficient way of illustrating
visual content. Traditional workflows to convert static clipart images into motion sequences …
Vbench++: Comprehensive and versatile benchmark suite for video generative models
Video generation has witnessed significant advancements, yet evaluating these models
remains a challenge. A comprehensive evaluation benchmark for video generation is …
MarDini: Masked autoregressive diffusion for video generation at scale
We introduce MarDini, a new family of video diffusion models that integrate the advantages
of masked auto-regression (MAR) into a unified diffusion model (DM) framework. Here, MAR …
Draw an audio: Leveraging multi-instruction for video-to-audio synthesis
Foley is a term commonly used in filmmaking, referring to the addition of daily sound effects
to silent films or videos to enhance the auditory experience. Video-to-Audio (V2A), as a …
Videoelevator: Elevating video generation quality with versatile text-to-image diffusion models
Text-to-image diffusion models (T2I) have demonstrated unprecedented capabilities in
creating realistic and aesthetic images. On the contrary, text-to-video diffusion models (T2V) …
Atomovideo: High fidelity image-to-video generation
Recently, video generation has achieved significant rapid development based on superior
text-to-image generation techniques. In this work, we propose a high fidelity framework for …
ObjCtrl-2.5D: Training-free Object Control with Camera Poses
This study aims to achieve more precise and versatile object control in image-to-video (I2V)
generation. Current methods typically represent the spatial movement of target objects with …