PARP: Prune, adjust and re-prune for self-supervised speech recognition

CIJ Lai, Y Zhang, AH Liu, S Chang… - Advances in …, 2021 - proceedings.neurips.cc
Self-supervised speech representation learning (speech SSL) has demonstrated the benefit
of scale in learning rich representations for Automatic Speech Recognition (ASR) with …

How does pre-trained wav2vec 2.0 perform on domain-shifted ASR? An extensive benchmark on air traffic control communications

J Zuluaga-Gomez, A Prasad… - 2022 IEEE spoken …, 2023 - ieeexplore.ieee.org
Recent work on self-supervised pre-training focuses on leveraging large-scale unlabeled
speech data to build robust end-to-end (E2E) acoustic models (AM) that can be later fine …

Towards better domain adaptation for self-supervised models: A case study of child ASR

R Fan, Y Zhu, J Wang, A Alwan - IEEE Journal of Selected …, 2022 - ieeexplore.ieee.org
Recently, self-supervised learning (SSL) from unlabelled speech data has gained increased
attention in the automatic speech recognition (ASR) community. Typical SSL methods …

DRAFT: A novel framework to reduce domain shifting in self-supervised learning and its application to children's ASR

R Fan, A Alwan - arXiv preprint arXiv:2206.07931, 2022 - arxiv.org
Self-supervised learning (SSL) in the pretraining stage using un-annotated speech data has
been successful in low-resource automatic speech recognition (ASR) tasks. However …

Examining the interplay between privacy and fairness for speech processing: A review and perspective

A Leschanowsky, S Das - arXiv preprint arXiv:2408.15391, 2024 - arxiv.org
Speech technology has been increasingly deployed in various areas of daily life including
sensitive domains such as healthcare and law enforcement. For these technologies to be …

A study of gender impact in self-supervised models for speech-to-text systems

MZ Boito, L Besacier, N Tomashenko… - arXiv preprint arXiv …, 2022 - arxiv.org
Self-supervised models for speech processing emerged recently as popular foundation
blocks in speech processing pipelines. These models are pre-trained on unlabeled audio …

On the social bias of speech self-supervised models

YC Lin, TQ Lin, HC Lin, AT Liu, H Lee - arXiv preprint arXiv:2406.04997, 2024 - arxiv.org
Self-supervised learning (SSL) speech models have achieved remarkable performance in
various tasks, yet the biased outcomes, especially affecting marginalized groups, raise …

Pre-trained Speech Processing Models Contain Human-Like Biases that Propagate to Speech Emotion Recognition

I Slaughter, C Greenberg, R Schwartz… - arXiv preprint arXiv …, 2023 - arxiv.org
Previous work has established that a person's demographics and speech style affect how
well speech processing models perform for them. But where does this bias come from? In …

Causal reasoning for algorithmic fairness in voice controlled cyber-physical systems

G Fenu, M Marras, G Medda, G Meloni - Pattern Recognition Letters, 2023 - Elsevier
Automated speaker recognition is enabling personalized interactions with the voice-based
interfaces and assistants part of the modern cyber-physical-social systems. Prior studies …

Self-supervised speech representations still struggle with African American Vernacular English

K Chang, YH Chou, J Shi, HM Chen, N Holliday… - arXiv preprint arXiv …, 2024 - arxiv.org
Underperformance of ASR systems for speakers of African American Vernacular English
(AAVE) and other marginalized language varieties is a well-documented phenomenon, and …