Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
Visual speech recognition for multiple languages in the wild
Visual speech recognition (VSR) aims to recognize the content of speech based on lip
movements, without relying on the audio stream. Advances in deep learning and the …
SpecAugment: A simple data augmentation method for automatic speech recognition
We present SpecAugment, a simple data augmentation method for speech recognition.
SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank …
Recent advances in embedding methods for multi-object tracking: a survey
Multi-object tracking (MOT) aims to associate target objects across video frames in order to
obtain entire moving trajectories. With the advancement of deep neural networks and the …
Intermediate loss regularization for CTC-based speech recognition
We present a simple and efficient auxiliary loss function for automatic speech recognition
(ASR) based on the connectionist temporal classification (CTC) objective. The proposed …
Improved training of end-to-end attention models for speech recognition
Sequence-to-sequence attention-based models on subword units allow simple open-
vocabulary end-to-end speech recognition. In this work, we show that such models can …
Auxiliary tasks benefit 3D skeleton-based human motion prediction
Exploring spatial-temporal dependencies from observed motions is one of the core
challenges of human motion prediction. Previous methods mainly focus on dedicated …
Self-supervised generalisation with meta auxiliary learning
Learning with auxiliary tasks can improve the ability of a primary task to generalise.
However, this comes at the cost of manually labelling auxiliary data. We propose a new …
Transfer learning
SJ Pan - Learning, 2020 - api.taylorfrancis.com
Supervised machine learning techniques have already been widely studied and applied to
various real-world applications. However, most existing supervised algorithms work well …