Unsupervised speech representation learning using wavenet autoencoders

J Chorowski, RJ Weiss, S Bengio… - … /ACM transactions on …, 2019 - ieeexplore.ieee.org
We consider the task of unsupervised extraction of meaningful latent representations of
speech by applying autoencoding neural networks to speech waveforms. The goal is to …

The zero resource speech challenge 2017

E Dunbar, XN Cao, J Benjumea… - 2017 IEEE Automatic …, 2017 - ieeexplore.ieee.org
We describe a new challenge aimed at discovering subword and word units from raw
speech. This challenge is the followup to the Zero Resource Speech Challenge 2015. It …

The zero resource speech challenge 2015

M Versteegh, R Thiolliere, T Schatz, XN Cao… - Interspeech, 2015 - isca-archive.org
The Interspeech 2015 Zero Resource Speech Challenge aims at discovering
subword and word units from raw speech. The challenge provides the first unified and open …

Self-supervised language learning from raw audio: Lessons from the zero resource speech challenge

E Dunbar, N Hamilakis… - IEEE Journal of Selected …, 2022 - ieeexplore.ieee.org
Recent progress in self-supervised or unsupervised machine learning has opened the
possibility of building a full speech processing system from raw audio without using any …

A brief overview of unsupervised neural speech representation learning

L Borgholt, JD Havtorn, J Edin, L Maaløe… - arXiv preprint arXiv …, 2022 - arxiv.org
Unsupervised representation learning for speech processing has matured greatly in the last
few years. Work in computer vision and natural language processing has paved the way, but …

A comparison of neural network methods for unsupervised representation learning on the zero resource speech challenge

D Renshaw, H Kamper, A Jansen… - … Annual Conference of …, 2015 - kamperh.com
The success of supervised deep neural networks (DNNs) in speech recognition cannot be
transferred to zero-resource languages where the requisite transcriptions are unavailable …

Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study

H Chen, CC Leung, L Xie, B Ma, H Li - INTERSPEECH, 2015 - isca-archive.org
We adopt a Dirichlet process Gaussian mixture model (DPGMM) for unsupervised acoustic
modeling and represent speech frames with Gaussian posteriorgrams. The model performs …

The zero resource speech challenge 2015: Proposed approaches and results

M Versteegh, X Anguera, A Jansen… - Procedia Computer …, 2016 - Elsevier
This paper reports on the results of the Zero Resource Speech Challenge 2015, the first
unified benchmark for zero resource speech technology, which aims at the unsupervised …

Joint learning of speaker and phonetic similarities with siamese networks

N Zeghidour, G Synnaeve, N Usunier, E Dupoux - INTERSPEECH, 2016 - isca-archive.org
Recent work has demonstrated, on small datasets, the feasibility of jointly learning
specialized speaker and phone embeddings, in a weakly supervised siamese DNN …

A deep scattering spectrum—deep siamese network pipeline for unsupervised acoustic modeling

N Zeghidour, G Synnaeve… - … on Acoustics, Speech …, 2016 - ieeexplore.ieee.org
Recent work has explored deep architectures for learning acoustic features in an
unsupervised or weakly-supervised way for phone recognition. Here we investigate the role …