Belfusion: Latent diffusion for behavior-driven human motion prediction
Stochastic human motion prediction (HMP) has generally been tackled with generative
adversarial networks and variational autoencoders. Most prior works aim at predicting highly …
Breaking the limits of text-conditioned 3d motion synthesis with elaborative descriptions
Given its wide applications, there is increasing focus on generating 3D human motions from
textual descriptions. Differing from the majority of previous works, which regard actions as …
Diverse human motion prediction via gumbel-softmax sampling from an auxiliary space
Diverse human motion prediction aims at predicting multiple possible future pose
sequences from a sequence of observed poses. Previous approaches usually employ deep …
Text Motion Translator: A Bi-directional Model for Enhanced 3D Human Motion Generation from Open-Vocabulary Descriptions
The field of 3D human motion generation from natural language descriptions, known as
Text2Motion, has gained significant attention for its potential application in industries such …
Autoregressive Models in Vision: A Survey
Autoregressive modeling has been a huge success in the field of natural language
processing (NLP). Recently, autoregressive models have emerged as a significant area of …
MSTP-net: Multiscale spatio-temporal parallel networks for human motion prediction
As a new rising technology, human motion prediction has broad application prospects in the
field of consumer electronics. Since different scale features have different receptive fields in …
CS-IntroVAE: Cauchy-Schwarz Divergence-Based Introspective Variational Autoencoder
Z Yu, Y Yang, Y Zhu, B Guo, C Li - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Although generative models are still being developed, image reconstruction and generation
tasks have evolved dramatically. Since the most popular generative models still have some …
A Survey on Vision Autoregressive Model
K Jiang, J Huang - arXiv preprint arXiv:2411.08666, 2024 - arxiv.org
Autoregressive models have demonstrated great performance in natural language
processing (NLP) with impressive scalability, adaptability and generalizability. Inspired by …
EMO2: End-Effector Guided Audio-Driven Avatar Video Generation
In this paper, we propose a novel audio-driven talking head method capable of
simultaneously generating highly expressive facial expressions and hand gestures. Unlike …
Speech modeling with a hierarchical transformer dynamical vae
The dynamical variational autoencoders (DVAEs) are a family of latent-variable deep
generative models that extends the VAE to model a sequence of observed data and a …