A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

StyleTTS-VC: One-shot voice conversion by knowledge transfer from style-based TTS models

YA Li, C Han, N Mesgarani - 2022 IEEE Spoken Language …, 2023 - ieeexplore.ieee.org
One-shot voice conversion (VC) aims to convert speech from any source speaker to an
arbitrary target speaker with only a few seconds of reference speech from the target speaker …

Zero-shot voice conversion based on feature disentanglement

N Guo, J Wei, Y Li, W Lu, J Tao - Speech Communication, 2024 - Elsevier
Voice conversion (VC) aims to convert the voice from a source speaker to a target speaker
without modifying the linguistic content. Zero-shot voice conversion has attracted significant …

Robust speaker personalisation using generalized low-rank adaptation for automatic speech recognition

A Baby, G Joseph, S Singh - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
For voice assistant systems, personalizing automated speech recognition (ASR) to a
customer is the proverbial holy grail. Careful selection of hyper-parameters will be …