- Academic Search

Exploring neural transducers for end-to-end speech recognition

E Battenberg, J Chen, R Child, A Coates… - 2017 IEEE automatic …, 2017 - ieeexplore.ieee.org

In this work, we perform an empirical comparison among the CTC, RNN-Transducer, and
attention-based Seq2Seq models for end-to-end speech recognition. We show that, without …

Cited by 263

Per-channel energy normalization: Why and how

V Lostanlen, J Salamon, M Cartwright… - IEEE Signal …, 2018 - ieeexplore.ieee.org

In the context of automatic speech recognition and acoustic event detection, an adaptive
procedure named per-channel energy normalization (PCEN) has recently shown to …

Cited by 117

Improved training for online end-to-end speech recognition systems

S Kim, ML Seltzer, J Li, R Zhao - arXiv preprint arXiv:1711.02212, 2017 - arxiv.org

Achieving high accuracy with end-to-end speech recognizers requires careful parameter
initialization prior to training. Otherwise, the networks may fail to find a good local optimum …

Cited by 51

Forget a Bit to Learn Better: Soft Forgetting for CTC-Based Automatic Speech Recognition

K Audhkhasi, G Saon, Z Tüske, B Kingsbury… - Interspeech, 2019 - academia.edu

Prior work has shown that connectionist temporal classification (CTC)-based automatic
speech recognition systems perform well when using bidirectional long short-term memory …

Cited by 34

Learning to detect dysarthria from raw speech

J Millet, N Zeghidour - ICASSP 2019-2019 IEEE International …, 2019 - ieeexplore.ieee.org

Speech classifiers of paralinguistic traits traditionally learn from diverse hand-crafted low-
level features, by selecting the relevant information for the task at hand. We explore an …

Cited by 45

On front-end gain invariant modeling for wake word spotting

Y Gao, ND Stein, CC Kao, Y Cai, M Sun… - arXiv preprint arXiv …, 2020 - arxiv.org

Wake word (WW) spotting is challenging in far-field due to the complexities and variations in
acoustic conditions and the environmental interference in signal transmission. A suite of …

Cited by 12

Improving knowledge distillation of CTC-trained acoustic models with alignment-consistent ensemble and target delay

H Ding, K Chen, Q Huo - IEEE/ACM transactions on audio …, 2020 - ieeexplore.ieee.org

Knowledge distillation (KD) has been widely used to improve the performance of a simpler
student model by imitating the outputs or intermediate representations of a more complex …

Cited by 8

Acoustic domain mismatch compensation in bird audio detection

T Tang, Y Long, Y Li, J Liang - International Journal of Speech Technology, 2022 - Springer

Detecting bird calls in audio is an important task for automatic wildlife monitoring, as well as
in citizen science and audio library management. This paper presents front-end acoustic …

Cited by 2

Reducing bias in production speech models

Exploring neural transducers for end-to-end speech recognition
