A survey of deep network techniques all classifiers can adopt
Deep neural networks (DNNs) have introduced novel and useful tools to the machine
learning community. Other types of classifiers can potentially make use of these tools as well …
Multimodal and multilingual embeddings for large-scale speech mining
We present an approach to encode a speech signal into a fixed-size representation which
minimizes the cosine loss with the existing massively multilingual LASER text embedding …
Do infants really learn phonetic categories?
Early changes in infants' ability to perceive native and nonnative speech sound contrasts are
typically attributed to their developing knowledge of phonetic categories. We critically …
DP-Parse: Finding word boundaries from raw speech with an instance lexicon
R Algayres, T Ricoul, J Karadayi… - Transactions of the …, 2022 - direct.mit.edu
Finding word boundaries in continuous speech is challenging as there is little or no
equivalent of a 'space' delimiter between words. Popular Bayesian non-parametric models …
Feature learning for efficient ASR-free keyword spotting in low-resource languages
We consider feature learning for a computationally efficient method of keyword spotting that
can be applied in severely under-resourced settings. The objective is to support …
Siamese capsule network for end-to-end speaker recognition in the wild
We propose an end-to-end deep model for speaker verification in the wild. Our model uses
thin-ResNet for extracting speaker embeddings from utterances and a Siamese capsule …
Improved acoustic word embeddings for zero-resource languages using multilingual transfer
Acoustic word embeddings are fixed-dimensional representations of variable-length speech
segments. Such embeddings can form the basis for speech search, indexing and discovery …