- Academic Search

Lipnet: End-to-end sentence-level lipreading

YM Assael, B Shillingford, S Whiteson… - arXiv preprint arXiv …, 2016 - arxiv.org

Lipreading is the task of decoding text from the movement of a speaker's mouth. Traditional
approaches separated the problem into two stages: designing or learning visual features …

Cited by 506

Large-scale visual speech recognition

B Shillingford, Y Assael, MW Hoffman, T Paine… - arXiv preprint arXiv …, 2018 - arxiv.org

This work presents a scalable solution to open-vocabulary visual speech recognition. To
achieve this, we constructed the largest existing visual speech recognition dataset …

Cited by 212

Comparing fusion models for DNN-based audiovisual continuous speech recognition

AH Abdelaziz - IEEE/ACM Transactions on Audio, Speech, and …, 2017 - ieeexplore.ieee.org

Audiovisual fusion is one of the most challenging tasks that continues to attract substantial
research interest in the field of audiovisual automatic speech recognition (AV-ASR). In the …

Cited by 50

Pseudo-convolutional policy gradient for sequence-to-sequence lip-reading

M Luo, S Yang, S Shan, X Chen - 2020 15th IEEE International …, 2020 - ieeexplore.ieee.org

Lip-reading aims to infer the speech content from the lip movement sequence and can be
seen as a typical sequence-to-sequence (seq2seq) problem which translates the input …

Cited by 53

Gating neural network for large vocabulary audiovisual speech recognition

F Tao, C Busso - IEEE/ACM Transactions on Audio, Speech …, 2018 - ieeexplore.ieee.org

Audio-based automatic speech recognition (A-ASR) systems are affected by noisy
conditions in real-world applications. Adding visual cues to the ASR system is an appealing …

Cited by 66

Improving speaker-independent lipreading with domain-adversarial training

M Wand, J Schmidhuber - arXiv preprint arXiv:1708.01565, 2017 - arxiv.org

We present a lipreading system, i.e. a speech recognition system using only visual features,
which uses domain-adversarial training for speaker independence. Domain-adversarial …

Cited by 68

Investigations on end-to-end audiovisual fusion

M Wand, J Schmidhuber, NT Vu - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

Audiovisual speech recognition (AVSR) is a method to alleviate the adverse effect of noise
in the acoustic signal. Leveraging recent developments in deep neural network-based …

Cited by 44

Aligning audiovisual features for audiovisual speech recognition

F Tao, C Busso - … Conference on Multimedia and Expo (ICME), 2018 - ieeexplore.ieee.org

Visual information can improve the performance of automatic speech recognition (ASR),
especially in the presence of background noise or different speech modes. A key problem is …

Cited by 34

RETRACTED ARTICLE: Application of deep learning in Mandarin Chinese lip-reading recognition

G Xing, L Han, Y Zheng, M Zhao - EURASIP Journal on Wireless …, 2023 - Springer

Lip-reading is an emerging technology in recent years, and it can be applied to the field of
language recovery, criminal investigation, identity authentication, etc. We aim to recognize …

Cited by 4

LipReading with 3D-2D-CNN BLSTM-HMM and word-CTC models

DK Margam, R Aralikatti, T Sharma, A Thanda… - arXiv preprint arXiv …, 2019 - arxiv.org

In recent years, deep learning based machine lipreading has gained prominence. To this
end, several architectures such as LipNet, LCANet and others have been proposed which …

Cited by 20

Dynamic Stream Weighting for Turbo-Decoding-Based Audiovisual ASR.

Lipnet: End-to-end sentence-level lipreading
