EMO: Emote portrait alive - generating expressive portrait videos with audio2video diffusion model under weak conditions
In this work, we tackle the challenge of enhancing the realism and expressiveness in talking
head video generation by focusing on the dynamic and nuanced relationship between audio …
Dreamvideo: Composing your dream videos with customized subject and motion
Customized generation using diffusion models has made impressive progress in image
generation but remains unsatisfactory in the challenging video generation task as it requires …
A survey on generative AI and LLM for video generation, understanding, and streaming
This paper offers an insightful examination of how currently top-trending AI technologies, i.e.,
generative artificial intelligence (Generative AI) and large language models (LLMs), are …
Deepfake generation and detection: A benchmark and survey
Deepfake is a technology dedicated to creating highly realistic facial images and videos
under specific conditions, which has significant application potential in fields such as …
A recipe for scaling up text-to-video generation with text-free videos
Diffusion-based text-to-video generation has witnessed impressive progress in the past year
yet still falls behind text-to-image generation. One of the key reasons is the limited scale of …
Hallo: Hierarchical audio-driven visual synthesis for portrait image animation
The field of portrait image animation, driven by speech audio input, has experienced
significant advancements in the generation of realistic and dynamic portraits. This research …
EmoTalk3D: high-fidelity free-view synthesis of emotional 3D talking head
We present a novel approach for synthesizing 3D talking heads with controllable emotion,
featuring enhanced lip synchronization and rendering quality. Despite significant progress in …
Hallo2: Long-duration and high-resolution audio-driven portrait image animation
Recent advances in latent diffusion-based generative models for portrait image animation,
such as Hallo, have achieved impressive results in short-duration video synthesis. In this …
Survey: Transformer-based Models in Data Modality Conversion
Transformers have made significant strides across various artificial intelligence domains,
including natural language processing, computer vision, and audio processing. This …
Loopy: Taming audio-driven portrait avatar with long-term motion dependency
With the introduction of diffusion-based video generation techniques, audio-conditioned
human video generation has recently achieved significant breakthroughs in both the …