Google Scholar

Pros and cons of GAN evaluation measures: New developments

A Borji - Computer Vision and Image Understanding, 2022 - Elsevier

This work is an update of my previous paper on the same topic published a few years ago
(Borji, 2019). With the dramatic progress in generative modeling, a suite of new quantitative …

Cited by 347

Deep audio-visual learning: A survey

H Zhu, MD Luo, R Wang, AH Zheng, R He - International Journal of …, 2021 - Springer

Audio-visual learning, aimed at exploiting the relationship between audio and visual
modalities, has drawn considerable attention since deep learning started to be used …

Cited by 192

VASA-1: Lifelike audio-driven talking faces generated in real time

S Xu, G Chen, YX Guo, J Yang, C Li… - Advances in …, 2025 - proceedings.neurips.cc

We introduce VASA, a framework for generating lifelike talking faces with appealing visual
affective skills (VAS) given a single static image and a speech audio clip. Our premiere …

Cited by 64

CodeTalker: Speech-driven 3D facial animation with discrete motion prior

J Xing, M Xia, Y Zhang, X Cun… - Proceedings of the …, 2023 - openaccess.thecvf.com

Speech-driven 3D facial animation has been widely studied, yet there is still a gap to
achieving realism and vividness due to the highly ill-posed nature and scarcity of audio …

Cited by 156

Learning audio-visual speech representation by masked multimodal cluster prediction

B Shi, WN Hsu, K Lakhotia, A Mohamed - arXiv preprint arXiv:2201.02184, 2022 - arxiv.org

Video recordings of speech contain correlated audio and visual information, providing a
strong signal for speech representation learning from the speaker's lip movements and the …

Cited by 335

Pose-controllable talking face generation by implicitly modularized audio-visual representation

H Zhou, Y Sun, W Wu, CC Loy… - Proceedings of the …, 2021 - openaccess.thecvf.com

While accurate lip synchronization has been achieved for arbitrary-subject audio-driven
talking face generation, the problem of how to efficiently drive the head pose remains …

Cited by 406

Flow-guided one-shot talking face generation with a high-resolution audio-visual dataset

Z Zhang, L Li, Y Ding, C Fan - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

One-shot talking face generation should synthesize high visual quality facial videos with
reasonable animations of expression and head pose, and just utilize arbitrary driving audio …

Cited by 327

A lip sync expert is all you need for speech to lip generation in the wild

KR Prajwal, R Mukhopadhyay, VP Namboodiri… - Proceedings of the 28th …, 2020 - dl.acm.org

In this work, we investigate the problem of lip-syncing a talking face video of an arbitrary
identity to match a target speech segment. Current works excel at producing accurate lip …

Cited by 821

StyleSync: High-fidelity generalized and personalized lip sync in style-based generator

J Guan, Z Zhang, H Zhou, T Hu… - Proceedings of the …, 2023 - openaccess.thecvf.com

Despite recent advances in syncing lip movements with any audio waves, current methods
still struggle to balance generation quality and the model's generalization ability. Previous …

Cited by 59

Expressive talking head generation with granular audio-visual control

B Liang, Y Pan, Z Guo, H Zhou… - Proceedings of the …, 2022 - openaccess.thecvf.com

Generating expressive talking heads is essential for creating virtual humans. However,
existing one- or few-shot methods focus on lip-sync and head motion, ignoring the emotional …

Cited by 136
