Audio lottery: Speech recognition made ultra-lightweight, noise-robust, and transferable

S Ding, T Chen, Z Wang - International Conference on Learning Representations (ICLR), 2022 - par.nsf.gov
Lightweight speech recognition models have seen explosive demand owing to a growing
number of speech-interactive features on mobile devices. Since designing such systems …
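
The "audio lottery" of the title refers to the lottery ticket hypothesis. As a reading aid, here is a minimal numpy sketch of the iterative magnitude-prune-and-rewind loop that lottery-ticket experiments build on; the training step is an elided placeholder and the sizes are illustrative, so this is a generic illustration rather than the paper's exact procedure.

```python
import numpy as np

def prune_smallest(weights, mask, frac):
    """Zero out the smallest-magnitude `frac` of currently surviving weights."""
    surviving = np.abs(weights[mask.astype(bool)])
    k = int(len(surviving) * frac)
    threshold = np.partition(surviving, k)[k]        # k-th smallest surviving magnitude
    return mask * (np.abs(weights) >= threshold)

rng = np.random.default_rng(0)
w_init = rng.normal(size=(256, 256))   # initial weights, kept for rewinding
w, mask = w_init.copy(), np.ones((256, 256))

for _ in range(3):                     # three prune-and-rewind rounds
    # ... train the masked network here, e.g. w = train(w * mask) ...
    mask = prune_smallest(w, mask, frac=0.2)   # drop 20% of surviving weights
    w = w_init.copy()                  # rewind survivors to their initial values
print("surviving fraction:", mask.mean())
```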

Learning noise-invariant representations for robust speech recognition

D Liang, Z Huang, ZC Lipton - 2018 IEEE Spoken Language Technology Workshop (SLT), 2018 - ieeexplore.ieee.org
Despite rapid advances in speech recognition, current models remain brittle to superficial
perturbations to their inputs. Small amounts of noise can destroy the performance of an …
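
One common way to encourage noise-invariant representations, sketched below with a toy one-layer encoder, is to penalize the distance between the encodings of a clean input and a noise-augmented copy of it; this is a generic instance of the idea, and the paper's exact objective may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(64, 128))       # stand-in encoder weights

def encode(x):
    return np.tanh(W @ x)                       # toy one-layer encoder

x_clean = rng.normal(size=128)                  # clean feature frame
x_noisy = x_clean + 0.1 * rng.normal(size=128)  # additive-noise augmentation

# Invariance penalty: squared distance between clean and noisy encodings.
invariance_penalty = np.sum((encode(x_clean) - encode(x_noisy)) ** 2)
# total_loss = task_loss + lam * invariance_penalty   (lam is a tuning knob)
print(invariance_penalty)
```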

One-shot pruning of recurrent neural networks by jacobian spectrum evaluation

MS Zhang, B Stadie - arXiv preprint arXiv:1912.00120, 2019 - arxiv.org
Recent advances in the sparse neural network literature have made it possible to prune
many large feed-forward and convolutional networks with only a small quantity of data. Yet …
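
The core quantity here is the step-to-step Jacobian of the recurrence. A minimal sketch, assuming a vanilla tanh RNN cell: the Jacobian dh_t/dh_{t-1} equals diag(1 - h_t^2) W_h, and a candidate pruning mask can be scored by how much it distorts that Jacobian's singular values. The scoring rule below is illustrative, not the paper's exact criterion.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32
W_h = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))   # recurrent weights
h_prev = np.tanh(rng.normal(size=n))                     # some hidden state

def jacobian_spectrum(W, h):
    h_next = np.tanh(W @ h)
    J = np.diag(1.0 - h_next ** 2) @ W                   # dh_next / dh for a tanh cell
    return np.linalg.svd(J, compute_uv=False)            # singular value spectrum

base = jacobian_spectrum(W_h, h_prev)
mask = rng.random(W_h.shape) > 0.5                       # candidate 50% pruning mask
pruned = jacobian_spectrum(W_h * mask, h_prev)
print("spectrum distortion:", np.linalg.norm(base - pruned))
```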

Scalable deep neural networks via low-rank matrix factorization

A Yaguchi, T Suzuki, S Nitta, Y Sakata… - arXiv preprint arXiv …, 2019 - researchgate.net
Compressing deep neural networks (DNNs) is important for real-world applications
operating on resource-constrained devices. However, it is difficult to change the model size …
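
A minimal sketch of the underlying compression move: factor a dense weight matrix by truncated SVD so that one m-by-n matmul becomes two thin ones, cutting parameters from m*n to k*(m+n). Sizes are illustrative, and this shows the generic technique rather than the paper's specific training scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 512, 512, 64
W = rng.normal(size=(m, n))

U, s, Vt = np.linalg.svd(W, full_matrices=False)
U_k = U[:, :k] * s[:k]            # fold singular values into the left factor
V_k = Vt[:k, :]

x = rng.normal(size=n)
y_full = W @ x
y_lowrank = U_k @ (V_k @ x)       # two thin matmuls instead of one dense one
print("params:", m * n, "->", k * (m + n))
print("relative error:", np.linalg.norm(y_full - y_lowrank) / np.linalg.norm(y_full))
```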

Spectral pruning for recurrent neural networks

T Furuya, K Suetake, K Taniguchi… - International …, 2022 - proceedings.mlr.press
Recurrent neural networks (RNNs) are a class of neural networks used in sequential tasks.
However, in general, RNNs have a large number of parameters and involve enormous …
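
A minimal sketch of the covariance-based reconstruction at the heart of spectral pruning: keep a subset J of hidden units and reconstruct the full hidden vector linearly from them via the empirical covariance. The top-k-by-variance choice of J below is a simplification; the paper selects the subset against a spectral criterion.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n, k = 2000, 32, 8
H = np.tanh(rng.normal(size=(T, n)) @ rng.normal(size=(n, n)))  # hidden states over time
Sigma = (H.T @ H) / T                                           # empirical second moment

J = np.argsort(np.diag(Sigma))[-k:]                             # keep k highest-variance units
A = Sigma[:, J] @ np.linalg.inv(Sigma[np.ix_(J, J)] + 1e-6 * np.eye(k))

H_hat = H[:, J] @ A.T                                           # reconstruct all n units from k
err = np.linalg.norm(H - H_hat) / np.linalg.norm(H)
print("reconstruction error with", k, "of", n, "units:", err)
```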

Closing the Gap between Classification and Retrieval Models

A Taha - 2021 - search.proquest.com
Retrieval networks learn a feature embedding where similar samples are close together,
and different samples are far apart. This feature embedding is essential for computer vision …
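
For context, the standard triplet loss below is the textbook way such an embedding is trained: pull an anchor toward a same-class positive and push it away from a negative, up to a margin. This is a generic formulation, not the dissertation's specific contribution.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    d_pos = np.sum((anchor - positive) ** 2)   # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2)   # anchor-negative distance
    return max(0.0, d_pos - d_neg + margin)    # zero once the margin is satisfied

rng = np.random.default_rng(0)
a = rng.normal(size=64)
p = a + 0.05 * rng.normal(size=64)             # positive: near the anchor
n = rng.normal(size=64)                        # negative: unrelated sample
print(triplet_loss(a, p, n))
```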

SVMax: A Feature Embedding Regularizer

A Taha, A Hanson, A Shrivastava, L Davis - arXiv preprint arXiv …, 2021 - arxiv.org
A neural network regularizer (e.g., weight decay) boosts performance by explicitly penalizing
the complexity of a network. In this paper, we penalize inferior network activations--feature …
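
A minimal sketch of the quantity SVMax works with, assuming the regularizer is the mean singular value of a mini-batch's L2-normalized feature embedding matrix, maximized during training (i.e., subtracted from the loss) to spread features over the embedding space. Batch and embedding sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
B, d = 128, 64
E = rng.normal(size=(B, d))                        # mini-batch of feature embeddings
E = E / np.linalg.norm(E, axis=1, keepdims=True)   # row-wise L2 normalization

s = np.linalg.svd(E, compute_uv=False)             # singular values of the batch matrix
sv_mean = s.mean()
# total_loss = task_loss - lam * sv_mean   (lam weights the regularizer)
print("mean singular value:", sv_mean)
```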

On-Device End-to-end Speech Recognition with Multi-Step Parallel RNNs

Y Boo, J Park, L Lee, W Sung - 2018 IEEE Spoken Language Technology Workshop (SLT), 2018 - ieeexplore.ieee.org
Most current automatic speech recognition is performed on a remote server. However,
the demand for speech recognition on personal devices is increasing, owing to the …