Vector-quantized neural networks for acoustic unit discovery in the ZeroSpeech 2020 challenge

B van Niekerk, L Nortje, H Kamper - arXiv preprint arXiv:2005.09409, 2020 - arxiv.org
In this paper, we explore vector quantization for acoustic unit discovery. Leveraging
unlabelled data, we aim to learn discrete representations of speech that separate phonetic …

Unsupervised acoustic unit discovery for speech synthesis using discrete latent-variable neural networks

R Eloff, A Nortje, B van Niekerk, A Govender… - arXiv preprint arXiv …, 2019 - arxiv.org
For our submission to the ZeroSpeech 2019 challenge, we apply discrete latent-variable
neural networks to unlabelled speech and use the discovered units for speech synthesis …

CiwGAN and fiwGAN: Encoding information in acoustic data to model lexical learning with Generative Adversarial Networks

G Beguš - Neural Networks, 2021 - Elsevier
How can deep neural networks encode information that corresponds to words in human
speech into raw acoustic data? This paper proposes two neural network architectures for …

Generative adversarial phonology: Modeling unsupervised phonetic and phonological learning with neural networks

G Beguš - Frontiers in artificial intelligence, 2020 - frontiersin.org
Training deep neural networks on well-understood dependencies in speech data can
provide new insights into how they learn internal representations. This paper argues that …

Infant phonetic learning as perceptual space learning: A crosslinguistic evaluation of computational models

Y Matusevych, T Schatz, H Kamper… - Cognitive …, 2023 - Wiley Online Library
In the first year of life, infants' speech perception becomes attuned to the sounds of their
native language. This process of early phonetic learning has traditionally been framed as …

Identity-based patterns in deep convolutional networks: Generative adversarial phonology and reduplication

G Beguš - Transactions of the Association for Computational …, 2021 - direct.mit.edu
This paper models unsupervised learning of an identity-based pattern (or copying) in
speech called reduplication from raw continuous data with deep convolutional neural …

Local and non-local dependency learning and emergence of rule-like representations in speech data by deep convolutional generative adversarial networks

G Beguš - Computer speech & language, 2022 - Elsevier
This paper argues that training Generative Adversarial Networks (GANs) on local and non-local
dependencies in speech data offers insights into how deep neural networks discretize …

Exploring how generative adversarial networks learn phonological representations

J Chen, M Elsner - arXiv preprint arXiv:2305.12501, 2023 - arxiv.org
This paper explores how Generative Adversarial Networks (GANs) learn representations of
phonological phenomena. We analyze how GANs encode contrastive and non-contrastive …

Interpreting intermediate convolutional layers of generative CNNs trained on waveforms

G Beguš, A Zhou - IEEE/ACM transactions on audio, speech …, 2022 - ieeexplore.ieee.org
This paper presents a technique to interpret and visualize intermediate layers in generative
CNNs trained on raw speech data in an unsupervised manner. We argue that averaging …

Modeling unsupervised phonetic and phonological learning in Generative Adversarial Phonology

G Beguš - Proceedings of the Society for Computation in …, 2020 - aclanthology.org
This paper models phonetic and phonological learning as a dependency between random
space and generated speech data in the Generative Adversarial Neural network architecture …