- Academic Search

Large-scale visual speech recognition

B Shillingford, Y Assael, MW Hoffman, T Paine… - arXiv preprint arXiv …, 2018 - arxiv.org

This work presents a scalable solution to open-vocabulary visual speech recognition. To
achieve this, we constructed the largest existing visual speech recognition dataset …

Cited by 213

Extraction of visual features for lipreading

I Matthews, TF Cootes, JA Bangham… - … on Pattern Analysis …, 2002 - ieeexplore.ieee.org

The multimodal nature of speech is often ignored in human-computer interaction, but lip
deformations and other body motion, such as those of the head, convey additional …

Cited by 721

CUAVE: A new audio-visual database for multimodal human-computer interface research

EK Patterson, S Gurbuz, Z Tufekci… - 2002 IEEE International …, 2002 - ieeexplore.ieee.org

Multimodal signal processing has become an important topic of research for overcoming
certain problems of audio-only speech processing. Audio-visual speech recognition is one …

Cited by 411

Audio-visual integration in multimodal communication

T Chen, RR Rao - Proceedings of the IEEE, 1998 - ieeexplore.ieee.org

We review recent research that examines audio-visual integration in multimodal
communication. The topics include bimodality in human speech, human and automated lip …

Cited by 449

Audio visual speech recognition

C Neti, G Potamianos, J Luettin, I Matthews, H Glotin… - 2000 - infoscience.epfl.ch

We have made significant progress in automatic speech recognition (ASR) for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …

Cited by 376

Audiovisual speech processing

T Chen - IEEE signal processing magazine, 2001 - ieeexplore.ieee.org

We have reported activities in audiovisual speech processing, with emphasis on lip reading
and lip synchronization. These research results have shown that, with lip reading, it is …

Cited by 368

An image transform approach for HMM based automatic lipreading

G Potamianos, HP Graf… - … Conference on Image …, 1998 - ieeexplore.ieee.org

This paper concentrates on the visual front end for hidden Markov model based automatic
lipreading. Two approaches for extracting features relevant to lipreading, given image …

Cited by 295

Audiovisual information fusion in human–computer interfaces and intelligent environments: A survey

ST Shivappa, MM Trivedi, BD Rao - Proceedings of the IEEE, 2010 - ieeexplore.ieee.org

Microphones and cameras have been extensively used to observe and detect human
activity and to facilitate natural modes of interaction between humans and intelligent …

Cited by 165

Moving-talker, speaker-independent feature study, and baseline results using the CUAVE multimodal speech corpus

EK Patterson, S Gurbuz, Z Tufekci… - EURASIP Journal on …, 2002 - Springer

Strides in computer technology and the search for deeper, more powerful techniques in
signal processing have brought multimodal research to the forefront in recent years. Audio …

Cited by 179

Multiresolution and multimodal speech recognition with transformers

G Paraskevopoulos, S Parthasarathy, A Khare… - arXiv preprint arXiv …, 2020 - arxiv.org

This paper presents an audio visual automatic speech recognition (AV-ASR) system using a
Transformer-based architecture. We particularly focus on the scene context provided by the …

Cited by 49

Speaker independent audio-visual database for bimodal ASR

Large-scale visual speech recognition
