Deep learning for visual speech analysis: A survey

C Sheng, G Kuang, L Bai, C Hou, Y Guo… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual speech, referring to the visual domain of speech, has attracted increasing attention
due to its wide applications, such as public security, medical treatment, military defense, and …

A review of recent advances in visual speech decoding

Z Zhou, G Zhao, X Hong, M Pietikäinen - Image and vision computing, 2014 - Elsevier
Visual speech information plays an important role in automatic speech recognition (ASR)
especially when audio is corrupted or even inaccessible. Despite the success of audio …

Speaking in shorthand–A syllable-centric perspective for understanding pronunciation variation

S Greenberg - Speech Communication, 1999 - Elsevier
Current-generation automatic speech recognition (ASR) systems model spoken discourse
as a quasi-linear sequence of words and phones. Because it is unusual for every phone …

Operator-valued kernels for learning from functional response data

H Kadri, E Duflos, P Preux, S Canu… - Journal of Machine …, 2016 - jmlr.org
In this paper we consider the problems of supervised classification and regression in the
case where attributes and labels are functions: a data is represented by a set of functions …

The SuperSID project: Exploiting high-level information for high-accuracy speaker recognition

D Reynolds, W Andrews, J Campbell… - … , Speech, and Signal …, 2003 - ieeexplore.ieee.org
The area of automatic speaker recognition has been dominated by systems using only
short-term, low-level acoustic information, such as cepstral features. While these systems have …

Modeling coarticulation in EMG-based continuous speech recognition

T Schultz, M Wand - Speech Communication, 2010 - Elsevier
This paper discusses the use of surface electromyography for automatic speech recognition.
Electromyographic signals captured at the facial muscles record the activity of the human …

Speech production knowledge in automatic speech recognition

S King, J Frankel, K Livescu, E McDermott… - The Journal of the …, 2007 - pubs.aip.org
Although much is known about how speech is produced, and research into speech
production has resulted in measured articulatory data, feature systems of different kinds, and …

Detection of phonological features in continuous speech using neural networks

S King, P Taylor - Computer Speech & Language, 2000 - Elsevier
We report work on the first component of a two-stage speech recognition architecture based
on phonological features rather than phones. This paper reports experiments on three …

[BOOK][B] Biometric authentication: a machine learning approach

SY Kung, MW Mak, SH Lin - 2005 - eie.polyu.edu.hk
Gaussian Mixture Models (GMMs) and Radial Basis Function (RBF) networks are two of the
promising neural models for pattern classification. In this laboratory exercise, your task is to …

Combining acoustic and articulatory feature information for robust speech recognition

K Kirchhoff, GA Fink, G Sagerer - Speech Communication, 2002 - Elsevier
The idea of using articulatory representations for automatic speech recognition (ASR)
continues to attract much attention in the speech community. Representations which are …