Google Scholar

An overview of deep-learning-based audio-visual speech enhancement and separation

D Michelsanti, ZH Tan, SX Zhang, Y Xu… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org

Speech enhancement and speech separation are two related tasks, whose purpose is to
extract either one or more target speech signals, respectively, from a mixture of sounds …

Cited by 304; all 6 versions

Deep audio-visual learning: A survey

H Zhu, MD Luo, R Wang, AH Zheng, R He - International Journal of …, 2021 - Springer

Audio-visual learning, aimed at exploiting the relationship between audio and visual
modalities, has drawn considerable attention since deep learning started to be used …

Cited by 191; all 12 versions

Mead: A large-scale audio-visual dataset for emotional talking-face generation

K Wang, Q Wu, L Song, Z Yang, W Wu, C Qian… - … on Computer Vision, 2020 - Springer

The synthesis of natural emotional reactions is an essential criterion in vivid talking-face
video generation. This criterion is nevertheless seldom taken into consideration in previous …

Cited by 318; all 2 versions

Beat: A large-scale semantic and emotional multi-modal dataset for conversational gestures synthesis

H Liu, Z Zhu, N Iwamoto, Y Peng, Z Li, Y Zhou… - European conference on …, 2022 - Springer

Achieving realistic, vivid, and human-like synthesized conversational gestures conditioned
on multi-modal data is still an unsolved problem due to the lack of available datasets …

Cited by 137; all 7 versions

EchoSpeech: Continuous Silent Speech Recognition on Minimally-obtrusive Eyewear Powered by Acoustic Sensing

R Zhang, K Li, Y Hao, Y Wang, Z Lai… - Proceedings of the …, 2023 - dl.acm.org

We present EchoSpeech, a minimally-obtrusive silent speech interface (SSI) powered by
low-power active acoustic sensing. EchoSpeech uses speakers and microphones mounted …

Cited by 28

EmoTalk3D: high-fidelity free-view synthesis of emotional 3D talking head

Q He, X Ji, Y Gong, Y Lu…

Cited by 40; all 3 versions

Can we read speech beyond the lips? Rethinking RoI selection for deep visual speech recognition

Y Zhang, S Yang, J Xiao, S Shan… - 2020 15th IEEE …, 2020 - ieeexplore.ieee.org

Recent advances in deep learning have heightened interest among researchers in the field
of visual speech recognition (VSR). Currently, most existing methods equate VSR with …

Cited by 89; all 7 versions

An experimental analysis of deep learning architectures for supervised speech enhancement

SA Nossier, J Wall, M Moniri, C Glackin, N Cannings - Electronics, 2020 - mdpi.com

Recent speech enhancement research has shown that deep learning techniques are very
effective in removing background noise. Many deep neural networks are being proposed …

Cited by 52; all 4 versions

Saved to My library

A corpus of audio-visual Lombard speech with frontal and profile views

An overview of deep-learning-based audio-visual speech enhancement and separation

Deep audio-visual learning: A survey

Mead: A large-scale audio-visual dataset for emotional talking-face generation

Beat: A large-scale semantic and emotional multi-modal dataset for conversational gestures synthesis

EchoSpeech: Continuous Silent Speech Recognition on Minimally-obtrusive Eyewear Powered by Acoustic Sensing

EmoTalk3D: high-fidelity free-view synthesis of emotional 3D talking head

Can we read speech beyond the lips? Rethinking RoI selection for deep visual speech recognition

An experimental analysis of deep learning architectures for supervised speech enhancement