Deep learning for visual speech analysis: A survey

C Sheng, G Kuang, L Bai, C Hou, Y Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …

Audio-driven emotional video portraits

X Ji, H Zhou, K Wang, W Wu, CC Loy… - Proceedings of the …, 2021 - openaccess.thecvf.com
Despite previous success in generating audio-driven talking heads, most of the previous
studies focus on the correlation between speech content and the mouth shape. Facial …

DexMV: Imitation learning for dexterous manipulation from human videos

Y Qin, YH Wu, S Liu, H Jiang, R Yang, Y Fu… - European Conference on …, 2022 - Springer
While significant progress has been made on understanding hand-object interactions in
computer vision, it is still very challenging for robots to perform complex dexterous …

Controllable person image synthesis with attribute-decomposed GAN

Y Men, Y Mao, Y Jiang, WY Ma… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
This paper introduces the Attribute-Decomposed GAN, a novel generative model for
controllable person image synthesis, which can produce realistic person images with …

GAN compression: Efficient architectures for interactive conditional GANs

M Li, J Lin, Y Ding, Z Liu, JY Zhu… - Proceedings of the …, 2020 - openaccess.thecvf.com
Conditional Generative Adversarial Networks (cGANs) have enabled controllable
image synthesis for many computer vision and graphics applications. However, recent …

EMMN: Emotional motion memory network for audio-driven emotional talking face generation

S Tan, B Ji, Y Pan - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Synthesizing expression is essential to create realistic talking faces. Previous works
consider expressions and mouth shapes as a whole and predict them solely from audio …

Dancing to music

HY Lee, X Yang, MY Liu, TC Wang… - Advances in neural …, 2019 - proceedings.neurips.cc
Dancing to music is an instinctive move by humans. Learning to model the music-to-dance
generation process is, however, a challenging problem. It requires significant efforts to …

Single motion diffusion

S Raab, I Leibovitch, G Tevet, M Arar… - arXiv preprint arXiv …, 2023 - arxiv.org
Synthesizing realistic animations of humans, animals, and even imaginary creatures, has
long been a goal for artists and computer graphics professionals. Compared to the imaging …

Skeleton-aware networks for deep motion retargeting

K Aberman, P Li, D Lischinski… - ACM Transactions on …, 2020 - dl.acm.org
We introduce a novel deep learning framework for data-driven motion retargeting between
skeletons, which may have different structures yet correspond to homeomorphic graphs …

Unpaired motion style transfer from video to animation

K Aberman, Y Weng, D Lischinski, D Cohen-Or… - ACM Transactions on …, 2020 - dl.acm.org
Transferring the motion style from one animation clip to another, while preserving the motion
content of the latter, has been a long-standing problem in character animation. Most existing …