Self-supervised speech representation learning: A review

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

Contrastive self-supervised learning: review, progress, challenges and future research directions

P Kumar, P Rawat, S Chauhan - International Journal of Multimedia …, 2022 - Springer
In the last decade, deep supervised learning has had tremendous success. However, its
flaws, such as its dependency on manual and costly annotations on large datasets and …

Unsupervised pretraining transfers well across languages

M Riviere, A Joulin, PE Mazaré… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
Cross-lingual and multi-lingual training of Automatic Speech Recognition (ASR) has been
extensively investigated in the supervised setting. This assumes the existence of a parallel …

Libri-light: A benchmark for ASR with limited or no supervision

J Kahn, M Riviere, W Zheng… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
We introduce a new collection of spoken English audio suitable for training speech
recognition systems under limited or no supervision. It is derived from open-source audio …

Data augmenting contrastive learning of speech representations in the time domain

E Kharitonov, M Rivière, G Synnaeve… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org
Contrastive Predictive Coding (CPC), based on predicting future segments of speech from
past segments, is emerging as a powerful algorithm for representation learning of speech …

The Zero Resource Speech Challenge 2017

E Dunbar, XN Cao, J Benjumea… - 2017 IEEE Automatic …, 2017 - ieeexplore.ieee.org
We describe a new challenge aimed at discovering subword and word units from raw
speech. This challenge is the followup to the Zero Resource Speech Challenge 2015. It …

The Zero Resource Speech Challenge 2019: TTS without T

E Dunbar, R Algayres, J Karadayi, M Bernard… - arXiv preprint arXiv …, 2019 - arxiv.org
We present the Zero Resource Speech Challenge 2019, which proposes to build a speech
synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without …

Unsupervised automatic speech recognition: A review

H Aldarmaki, A Ullah, S Ram, N Zaki - Speech Communication, 2022 - Elsevier
Automatic Speech Recognition (ASR) systems can be trained to achieve
remarkable performance given large amounts of manually transcribed speech, but large …

Self-supervised language learning from raw audio: Lessons from the Zero Resource Speech Challenge

E Dunbar, N Hamilakis… - IEEE Journal of Selected …, 2022 - ieeexplore.ieee.org
Recent progress in self-supervised or unsupervised machine learning has opened the
possibility of building a full speech processing system from raw audio without using any …

VQVAE unsupervised unit discovery and multi-scale code2spec inverter for ZeroSpeech Challenge 2019

A Tjandra, B Sisman, M Zhang, S Sakti, H Li… - arXiv preprint arXiv …, 2019 - arxiv.org
We describe our submitted system for the ZeroSpeech Challenge 2019. The current
challenge theme addresses the difficulty of constructing a speech synthesizer without any …