Spoken instruction understanding in air traffic control: Challenge, technique, and application
Y Lin - Aerospace, 2021 - mdpi.com
In air traffic control (ATC), speech communication with radio transmission is the primary way
to exchange information between the controller and aircrew. A wealth of contextual …
From english to more languages: Parameter-efficient model reprogramming for cross-lingual speech recognition
In this work, we propose a new parameter-efficient learning framework based on neural
model reprogramming for cross-lingual speech recognition, which can re-purpose well …
Class LM and word mapping for contextual biasing in end-to-end ASR
In recent years, all-neural, end-to-end (E2E) ASR systems gained rapid interest in the
speech recognition community. They convert speech input to text units in a single trainable …
Adversarial meta sampling for multilingual low-resource speech recognition
Low-resource automatic speech recognition (ASR) is challenging, as the low-resource target
language data cannot well train an ASR model. To solve this issue, meta-learning …
ATCSpeechNet: A multilingual end-to-end speech recognition framework for air traffic control systems
In this paper, a multilingual end-to-end framework, called ATCSpeechNet, is proposed to
tackle the issue of translating communication speech into human-readable text in air traffic …
The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines
Automatic speech recognition (ASR) has been significantly advanced with the use of deep
learning and big data. However, improving robustness, including achieving equally good …
SeACo-Paraformer: A non-autoregressive ASR system with flexible and effective hotword customization ability
Hotword customization is one of the concerned issues remained in ASR field - it is of value to
enable users of ASR systems to customize names of entities, persons and other phrases to …
Differentiable allophone graphs for language-universal speech recognition
Building language-universal speech recognition systems entails producing phonological
units of spoken sound that can be shared across languages. While speech annotations at …
Deferred NAM: Low-latency Top-K Context Injection via Deferred Context Encoding for Non-Streaming ASR
Contextual biasing enables speech recognizers to transcribe important phrases in the
speaker's context, such as contact names, even if they are rare in, or absent from, the …
A systematic comparison of grapheme-based vs. phoneme-based label units for encoder-decoder-attention models
Following the rationale of end-to-end modeling, CTC, RNN-T or encoder-decoder-attention
models for automatic speech recognition (ASR) use graphemes or grapheme-based …