- Academic Search

Y Lin - Aerospace, 2021 - mdpi.com

In air traffic control (ATC), speech communication with radio transmission is the primary way
to exchange information between the controller and aircrew. A wealth of contextual …

Cited by 72

From english to more languages: Parameter-efficient model reprogramming for cross-lingual speech recognition

CHH Yang, B Li, Y Zhang, N Chen… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

In this work, we propose a new parameter-efficient learning framework based on neural
model reprogramming for cross-lingual speech recognition, which can re-purpose well …

Cited by 36

Class LM and word mapping for contextual biasing in end-to-end ASR

R Huang, O Abdel-Hamid, X Li, G Evermann - arxiv preprint arxiv …, 2020 - arxiv.org

In recent years, all-neural, end-to-end (E2E) ASR systems gained rapid interest in the
speech recognition community. They convert speech input to text units in a single trainable …

Cited by 53

Adversarial meta sampling for multilingual low-resource speech recognition

Y Xiao, K Gong, P Zhou, G Zheng, X Liang… - Proceedings of the AAAI …, 2021 - ojs.aaai.org

Low-resource automatic speech recognition (ASR) is challenging, as the low-resource target
language data cannot well train an ASR model. To solve this issue, meta-learning …

Cited by 34

ATCSpeechNet: A multilingual end-to-end speech recognition framework for air traffic control systems

Y Lin, B Yang, L Li, D Guo, J Zhang, H Chen… - Applied Soft …, 2021 - Elsevier

In this paper, a multilingual end-to-end framework, called ATCSpeechNet, is proposed to
tackle the issue of translating communication speech into human-readable text in air traffic …

Cited by 36

The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines

F Yu, Z Yao, X Wang, K An, L Xie, Z Ou… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org

Automatic speech recognition (ASR) has been significantly advanced with the use of deep
learning and big data. However, improving robustness, including achieving equally good …

Cited by 20

SeACo-Paraformer: A non-autoregressive ASR system with flexible and effective hotword customization ability

X Shi, Y Yang, Z Li, Y Chen, Z Gao… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Hotword customization is one of the concerned issues remained in ASR field - it is of value to
enable users of ASR systems to customize names of entities, persons and other phrases to …

Cited by 13

Differentiable allophone graphs for language-universal speech recognition

B Yan, S Dalmia, DR Mortensen, F Metze… - arxiv preprint arxiv …, 2021 - arxiv.org

Building language-universal speech recognition systems entails producing phonological
units of spoken sound that can be shared across languages. While speech annotations at …

Cited by 13

Deferred NAM: Low-latency Top-K Context Injection via Deferred Context Encoding for Non-Streaming ASR

Z Wu, G Song, C Li, P Rondon, Z Meng, X Velez… - arxiv preprint arxiv …, 2024 - arxiv.org

Contextual biasing enables speech recognizers to transcribe important phrases in the
speaker's context, such as contact names, even if they are rare in, or absent from, the …

Cited by 2

A systematic comparison of grapheme-based vs. phoneme-based label units for encoder-decoder-attention models

M Zeineldeen, A Zeyer, W Zhou, T Ng… - arxiv preprint arxiv …, 2020 - arxiv.org

Following the rationale of end-to-end modeling, CTC, RNN-T or encoder-decoder-attention
models for automatic speech recognition (ASR) use graphemes or grapheme-based …

Cited by 18

Phoneme-based contextualization for cross-lingual speech recognition in end-to-end models

Spoken instruction understanding in air traffic control: Challenge, technique, and application
