Self-supervised speech representation learning: A review
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …
Contrastive self-supervised learning: review, progress, challenges and future research directions
In the last decade, deep supervised learning has had tremendous success. However, its
flaws, such as its dependency on manual and costly annotations on large datasets and …
Unsupervised pretraining transfers well across languages
Cross-lingual and multi-lingual training of Automatic Speech Recognition (ASR) has been
extensively investigated in the supervised setting. This assumes the existence of a parallel …
Libri-light: A benchmark for ASR with limited or no supervision
We introduce a new collection of spoken English audio suitable for training speech
recognition systems under limited or no supervision. It is derived from open-source audio …
Data augmenting contrastive learning of speech representations in the time domain
Contrastive Predictive Coding (CPC), based on predicting future segments of speech from
past segments, is emerging as a powerful algorithm for representation learning of speech …
The zero resource speech challenge 2017
We describe a new challenge aimed at discovering subword and word units from raw
speech. This challenge is the followup to the Zero Resource Speech Challenge 2015. It …
The zero resource speech challenge 2019: TTS without T
We present the Zero Resource Speech Challenge 2019, which proposes to build a speech
synthesizer without any text or phonetic labels: hence, TTS without T (text-to-speech without …
Unsupervised automatic speech recognition: A review
Automatic Speech Recognition (ASR) systems can be trained to achieve
remarkable performance given large amounts of manually transcribed speech, but large …
Self-supervised language learning from raw audio: Lessons from the zero resource speech challenge
E Dunbar, N Hamilakis… - IEEE Journal of Selected …, 2022 - ieeexplore.ieee.org
Recent progress in self-supervised or unsupervised machine learning has opened the
possibility of building a full speech processing system from raw audio without using any …
VQVAE unsupervised unit discovery and multi-scale code2spec inverter for zerospeech challenge 2019
We describe our submitted system for the ZeroSpeech Challenge 2019. The current
challenge theme addresses the difficulty of constructing a speech synthesizer without any …