Unsupervised automatic speech recognition: A review
Abstract Automatic Speech Recognition (ASR) systems can be trained to achieve
remarkable performance given large amounts of manually transcribed speech, but large …
Unsupervised neural network based feature extraction using weak top-down constraints
Deep neural networks (DNNs) have become a standard component in supervised ASR,
used in both data-driven feature extraction and acoustic modelling. Supervision is typically …
Spoken content retrieval—beyond cascading speech recognition with text retrieval
Spoken content retrieval refers to directly indexing and retrieving spoken content based on
the audio rather than text descriptions. This potentially eliminates the requirement of …
Fixed-dimensional acoustic embeddings of variable-length segments in low-resource settings
Measures of acoustic similarity between words or other units are critical for segmental
exemplar-based acoustic models, spoken term discovery, and query-by-example search …
Discriminative acoustic word embeddings: Recurrent neural network-based approaches
Acoustic word embeddings, fixed-dimensional vector representations of variable-length
spoken word segments, have begun to be considered for tasks such as speech recognition …
Multilingual representations for low resource speech recognition and keyword search
This paper examines the impact of multilingual (ML) acoustic representations on Automatic
Speech Recognition (ASR) and keyword search (KWS) for low resource languages in the …
Unsupervised word segmentation and lexicon discovery using acoustic word embeddings
In settings where only unlabeled speech data is available, speech technology needs to be
developed without transcriptions, pronunciation dictionaries, or language modelling text. A …
High-performance query-by-example spoken term detection on the SWS 2013 evaluation
In the last years, the task of Query-by-Example Spoken Term Detection (QbE-STD), which
aims to find occurrences of a spoken query in a set of audio documents, has gained the …
Acoustic segment modeling with spectral clustering methods
This paper presents a study of spectral clustering-based approaches to acoustic segment
modeling (ASM). ASM aims at finding the underlying phoneme-like speech units and …
Query-by-example spoken term detection using frequency domain linear prediction and non-segmental dynamic time warping
G Mantena, S Achanta… - IEEE/ACM Transactions on …, 2014 - ieeexplore.ieee.org
The task of query-by-example spoken term detection (QbE-STD) is to find a spoken query
within spoken audio data. Current state-of-the-art techniques assume zero prior knowledge …