Open-source conversational AI with SpeechBrain 1.0

M Ravanelli, T Parcollet, A Moumen… - Journal of Machine …, 2024 - jmlr.org
SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused
particularly on speech processing tasks such as speech recognition, speech enhancement …

A suite for acoustic language model evaluation

G Maimon, A Roth, Y Adi - arXiv preprint arXiv:2409.07437, 2024 - arxiv.org
Speech language models have recently demonstrated great potential as universal speech
processing systems. Such models have the ability to model the rich acoustic information …

TSELM: Target Speaker Extraction using Discrete Tokens and Language Models

B Tang, B Zeng, M Li - arXiv preprint arXiv:2409.07841, 2024 - arxiv.org
We propose TSELM, a novel target speaker extraction network that leverages discrete
tokens and language models. TSELM utilizes multiple discretized layers from WavLM as …

Do Discrete Self-Supervised Representations of Speech Capture Tone Distinctions?

O Osakuade, S King - arXiv preprint arXiv:2410.19935, 2024 - arxiv.org
Discrete representations of speech, obtained from Self-Supervised Learning (SSL)
foundation models, are widely used, especially where there are limited data for the …