An empirical survey of data augmentation for time series classification with neural networks

BK Iwana, S Uchida - PLOS ONE, 2021 - journals.plos.org
In recent times, deep artificial neural networks have achieved many successes in pattern
recognition. Part of this success can be attributed to the reliance on big data to increase …

Towards end-to-end speech recognition with recurrent neural networks

A Graves, N Jaitly - International conference on machine …, 2014 - proceedings.mlr.press
This paper presents a speech recognition system that directly transcribes audio data with
text, without requiring an intermediate phonetic representation. The system is based on a …

A review of recent advances in visual speech decoding

Z Zhou, G Zhao, X Hong, M Pietikäinen - Image and vision computing, 2014 - Elsevier
Visual speech information plays an important role in automatic speech recognition (ASR)
especially when audio is corrupted or even inaccessible. Despite the success of audio …

Data augmentation for deep neural network acoustic modeling

X Cui, V Goel, B Kingsbury - IEEE/ACM Transactions on Audio …, 2015 - ieeexplore.ieee.org
This paper investigates data augmentation for deep neural network acoustic modeling
based on label-preserving transformations to deal with data sparsity. Two data …

Large-vocabulary continuous speech recognition systems: A look at some recent advances

G Saon, JT Chien - IEEE signal processing magazine, 2012 - ieeexplore.ieee.org
Over the past decade or so, several advances have been made to the design of modern
large vocabulary continuous speech recognition (LVCSR) systems to the point where their …

Vocal tract length perturbation (VTLP) improves speech recognition

N Jaitly, GE Hinton - Proc. ICML workshop on deep learning for …, 2013 - cs.utoronto.ca
Augmenting datasets by transforming inputs in a way that does not change the label is a
crucial ingredient of the state of the art methods for object recognition using neural networks …

Speech synthesis and recognition

W Holmes - 2002 - taylorfrancis.com
With the growing impact of information technology on daily life, speech is becoming
increasingly important for providing a natural means of communication between humans …

Lung sounds classification using convolutional neural networks

D Bardou, K Zhang, SM Ahmad - Artificial intelligence in medicine, 2018 - Elsevier
Lung sounds convey relevant information related to pulmonary disorders, and to evaluate
patients with pulmonary conditions, the physician or the doctor uses the traditional …

Child speech recognition in human-robot interaction: evaluations and recommendations

J Kennedy, S Lemaignan, C Montassier… - Proceedings of the …, 2017 - dl.acm.org
An increasing number of human-robot interaction (HRI) studies are now taking place in
applied settings with children. These interactions often hinge on verbal interaction to …

Speaker anonymisation using the McAdams coefficient

J Patino, N Tomashenko, M Todisco, A Nautsch… - arXiv preprint arXiv …, 2020 - arxiv.org
Anonymisation has the goal of manipulating speech signals in order to degrade the
reliability of automatic approaches to speaker recognition, while preserving other aspects of …