Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation
Generating vivid and emotional 3D co-speech gestures is crucial for virtual avatar animation
in human-machine interaction applications. While the existing methods enable generating …
Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis
In this work, we present Semantic Gesticulator, a novel framework designed to synthesize
realistic gestures accompanying speech with strong semantic correspondence. Semantically …
Mambatalk: Efficient holistic gesture synthesis with selective state space models
Gesture synthesis is a vital realm of human-computer interaction, with wide-ranging
applications across various fields like film, robotics, and virtual reality. Recent advancements …
Towards Variable and Coordinated Holistic Co-Speech Motion Generation
This paper addresses the problem of generating lifelike holistic co-speech motions for 3D
avatars focusing on two key aspects: variability and coordination. Variability allows the …
EGGesture: Entropy-Guided Vector Quantized Variational AutoEncoder for Co-Speech Gesture Generation
Y Xiao, K Shu, H Zhang, B Yin, WS Cheang… - Proceedings of the …, 2024 - dl.acm.org
Co-Speech gesture generation encounters challenges with imbalanced, long-tailed gesture
distributions. While recent methods typically address this by employing Vector Quantized …
ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE
Audio-driven 3D facial animation synthesis has been an active field of research with
attention from both academia and industry. While there are promising results in this area …
RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text
In this work, we introduce a challenging task for simultaneously generating 3D holistic body
motions and singing vocals directly from textual lyrics inputs, advancing beyond existing …
VCoME: Verbal Video Composition with Multimodal Editing Effects
W Gong, X Jin, X Li, D He, X Wu - arXiv preprint arXiv:2407.04697, 2024 - arxiv.org
Verbal videos, featuring voice-overs or text overlays, provide valuable content but present
significant challenges in composition, especially when incorporating editing effects to …
GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling
Controlling human gestures based on speech signals presents a significant challenge in
computer vision. While existing works did preliminary studies of generating holistic co …
Realistic-Gesture: Co-Speech Gesture Video Generation through Context-Aware Gesture Representation
LS Generation - openreview.net
Co-speech gesture generation is crucial for creating lifelike avatars and enhancing human-
computer interactions by synchronizing gestures with speech in computer vision. Despite …