A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
Few-shot learning for medical text: A review of advances, trends, and opportunities
Background: Few-shot learning (FSL) is a class of machine learning methods that require
small numbers of labeled instances for training. With many medical topics having limited …
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models
Language models demonstrate both quantitative improvement and new qualitative
capabilities with increasing scale. Despite their potentially transformative impact, these new …
SUPERB: Speech processing universal performance benchmark
Self-supervised learning (SSL) has proven vital for advancing research in natural language
processing (NLP) and computer vision (CV). The paradigm pretrains a shared model on …
Multi-task pre-training for plug-and-play task-oriented dialogue system
Pre-trained language models have been recently shown to benefit task-oriented dialogue
(TOD) systems. Despite their success, existing methods often formulate this task as a …
Data augmentation using pre-trained transformer models
Language model based pre-trained models such as BERT have provided significant gains
across different NLP tasks. In this paper, we study different types of transformer based pre …
BERT for joint intent classification and slot filling
Intent classification and slot filling are two essential tasks for natural language
understanding. They often suffer from small-scale human-labeled training data, resulting in …
Efficient intent detection with dual sentence encoders
Building conversational systems in new domains and with added functionality requires
resource-efficient models that work under low-data regimes (i.e., in few-shot setups) …
A fine-tuned wav2vec 2.0/HuBERT benchmark for speech emotion recognition, speaker verification and spoken language understanding
Speech self-supervised models such as wav2vec 2.0 and HuBERT are making revolutionary
progress in Automatic Speech Recognition (ASR). However, they have not been totally …
SLURP: A spoken language understanding resource package
Spoken Language Understanding infers semantic meaning directly from audio data, and
thus promises to reduce error propagation and misunderstandings in end-user applications …