A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
learning. The use of multiple processing layers has enabled the creation of models capable …
Self-supervised speech representation learning: A review
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …
necessitated the building of specialist models for individual tasks and application scenarios …
Hubert: Self-supervised speech representation learning by masked prediction of hidden units
Self-supervised approaches for speech representation learning are challenged by three
unique problems:(1) there are multiple sound units in each input utterance,(2) there is no …
unique problems:(1) there are multiple sound units in each input utterance,(2) there is no …
[HTML][HTML] Deep speech 2: End-to-end speech recognition in english and mandarin
We show that an end-to-end deep learning approach can be used to recognize either
English or Mandarin Chinese speech–two vastly different languages. Because it replaces …
English or Mandarin Chinese speech–two vastly different languages. Because it replaces …
Deep learning in neural networks: An overview
J Schmidhuber - Neural networks, 2015 - Elsevier
In recent years, deep artificial neural networks (including recurrent ones) have won
numerous contests in pattern recognition and machine learning. This historical survey …
numerous contests in pattern recognition and machine learning. This historical survey …
Understanding LSTM--a tutorial into long short-term memory recurrent neural networks
RC Staudemeyer, ER Morris - arxiv preprint arxiv:1909.09586, 2019 - arxiv.org
Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) are one of the most
powerful dynamic classifiers publicly known. The network itself and the related learning …
powerful dynamic classifiers publicly known. The network itself and the related learning …
Speech recognition with deep recurrent neural networks
Recurrent neural networks (RNNs) are a powerful model for sequential data. End-to-end
training methods such as Connectionist Temporal Classification make it possible to train …
training methods such as Connectionist Temporal Classification make it possible to train …
[PDF][PDF] Long short-term memory recurrent neural network architectures for large scale acoustic modeling
Abstract Long Short-Term Memory (LSTM) is a specific recurrent neural network (RNN)
architecture that was designed to model temporal sequences and their long-range …
architecture that was designed to model temporal sequences and their long-range …
Deep learning: methods and applications
This monograph provides an overview of general deep learning methodology and its
applications to a variety of signal and information processing tasks. The application areas …
applications to a variety of signal and information processing tasks. The application areas …
Convolutional neural networks for speech recognition
Recently, the hybrid deep neural network (DNN)-hidden Markov model (HMM) has been
shown to significantly improve speech recognition performance over the conventional …
shown to significantly improve speech recognition performance over the conventional …