Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Visual speech recognition for multiple languages in the wild

P Ma, S Petridis, M Pantic - Nature Machine Intelligence, 2022 - nature.com
Visual speech recognition (VSR) aims to recognize the content of speech based on lip
movements, without relying on the audio stream. Advances in deep learning and the …

SpecAugment: A simple data augmentation method for automatic speech recognition

DS Park, W Chan, Y Zhang, CC Chiu, B Zoph… - arXiv preprint arXiv …, 2019 - arxiv.org
We present SpecAugment, a simple data augmentation method for speech recognition.
SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank …

Recent advances in embedding methods for multi-object tracking: a survey

G Wang, M Song, JN Hwang - arXiv preprint arXiv:2205.10766, 2022 - arxiv.org
Multi-object tracking (MOT) aims to associate target objects across video frames in order to
obtain entire moving trajectories. With the advancement of deep neural networks and the …

Intermediate loss regularization for CTC-based speech recognition

J Lee, S Watanabe - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org
We present a simple and efficient auxiliary loss function for automatic speech recognition
(ASR) based on the connectionist temporal classification (CTC) objective. The proposed …

Improved training of end-to-end attention models for speech recognition

A Zeyer, K Irie, R Schlüter, H Ney - arXiv preprint arXiv:1805.03294, 2018 - arxiv.org
Sequence-to-sequence attention-based models on subword units allow simple open-
vocabulary end-to-end speech recognition. In this work, we show that such models can …

Auxiliary tasks benefit 3D skeleton-based human motion prediction

C Xu, RT Tan, Y Tan, S Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Exploring spatial-temporal dependencies from observed motions is one of the core
challenges of human motion prediction. Previous methods mainly focus on dedicated …

Self-supervised generalisation with meta auxiliary learning

S Liu, A Davison, E Johns - Advances in Neural Information …, 2019 - proceedings.neurips.cc
Learning with auxiliary tasks can improve the ability of a primary task to generalise.
However, this comes at the cost of manually labelling auxiliary data. We propose a new …

Transfer learning

SJ Pan - Learning, 2020 - api.taylorfrancis.com
Supervised machine learning techniques have already been widely studied and applied to
various real-world applications. However, most existing supervised algorithms work well …