A virtual simulation-pilot agent for training of air traffic controllers
In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic
controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI) …
Automatic speech recognition benchmark for air-traffic communications
Advances in Automatic Speech Recognition (ASR) over the last decade opened new areas
of speech-based automation such as in Air-Traffic Control (ATC) environment. Currently …
Are disentangled representations all you need to build speaker anonymization systems?
Speech signals contain a lot of sensitive information, such as the speaker's identity, which
raises privacy concerns when speech data get collected. Speaker anonymization aims to …
Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model
In this work, we investigate if the wav2vec 2.0 self-supervised pretraining helps mitigate the
overfitting issues with connectionist temporal classification (CTC) training to reduce its …
Lattice-free MMI adaptation of self-supervised pretrained acoustic models
In this work, we propose lattice-free MMI (LFMMI) for supervised adaptation of self-
supervised pretrained acoustic model. We pretrain a Transformer model on thousand hours …
Parameter-Efficient Tuning with Adaptive Bottlenecks for Automatic Speech Recognition
G Vanderreydt, A Prasad, D Khalil… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Transfer learning from large multilingual pretrained models, like XLSR, has become the new
paradigm for Automatic Speech Recognition (ASR). Considering their ever-increasing size …
Anonymizing speech: Evaluating and designing speaker anonymization techniques
P Champion - arXiv preprint arXiv:2308.04455, 2023 - arxiv.org
The growing use of voice user interfaces has led to a surge in the collection and storage of
speech data. While data collection allows for the development of efficient tools powering …
Fine-Tuning Self-Supervised Models for Language Identification Using Orthonormal Constraint
Self-supervised models trained with high linguistic diversity, such as the XLS-R model, can
be effectively fine-tuned for the language recognition task. Typically, a back-end classifier …
Effectiveness of text, acoustic, and lattice-based representations in spoken language understanding tasks
In this paper, we perform an exhaustive evaluation of different representations to address
the intent classification problem in a Spoken Language Understanding (SLU) setup. We …
Multitask Adaptation with Lattice-Free MMI for Multi-Genre Speech Recognition of Low Resource Languages
In this paper, we develop Automatic Speech Recognition (ASR) systems for multi-genre
speech recognition of low-resource languages where training data is predominantly …