EmoTalk: Speech-driven emotional disentanglement for 3D face animation
Speech-driven 3D face animation aims to generate realistic facial expressions that match
the speech content and emotion. However, existing methods often neglect emotional facial …
Learning audio-visual speech representation by masked multimodal cluster prediction
Video recordings of speech contain correlated audio and visual information, providing a
strong signal for speech representation learning from the speaker's lip movements and the …
FaceFormer: Speech-driven 3D facial animation with transformers
Speech-driven 3D facial animation is challenging due to the complex geometry of human
faces and the limited availability of 3D audio-visual data. Prior works typically focus on …
DiffSHEG: A diffusion-based approach for real-time speech-driven holistic 3D expression and gesture generation
We propose DiffSHEG, a Diffusion-based approach for Speech-driven Holistic 3D
Expression and Gesture generation. While previous works focused on co-speech gesture or …
LipSync3D: Data-efficient learning of personalized 3D talking faces from video using pose and lighting normalization
In this paper, we present a video-based learning framework for animating personalized 3D
talking faces from audio. We introduce two training-time data normalizations that significantly …
Language-guided music recommendation for video via prompt analogies
We propose a method to recommend music for an input video while allowing a user to guide
music selection with free-form natural language. A key challenge of this problem setting is …
Audio-Driven Facial Animation with Deep Learning: A Survey
Audio-driven facial animation is a rapidly evolving field that aims to generate realistic facial
expressions and lip movements synchronized with a given audio input. This survey provides …
Learnable irrelevant modality dropout for multimodal action recognition on modality-specific annotated videos
S Alfasly, J Lu, C Xu, Y Zou - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com
Under the assumption that a video dataset is annotated across modalities, in which both the auditory and
visual modalities are labeled or class-relevant, current multimodal methods apply …
Missing modality robustness in semi-supervised multi-modal semantic segmentation
Using multiple spatial modalities has been proven helpful in improving semantic
segmentation performance. However, there are several real-world challenges that have yet …
LaughTalk: Expressive 3D talking head generation with laughter
Laughter is a unique expression, essential to affirmative social interactions of humans.
Although current 3D talking head generation methods produce convincing verbal …