Glottal source processing: From analysis to applications

T Drugman, P Alku, A Alwan… - Computer Speech & …, 2014 - Elsevier
The great majority of current voice technology applications rely on acoustic features, such as
the widely used MFCC or LP parameters, which characterize the vocal tract response …

Time-frequency processing of nonstationary signals: Advanced TFD design to aid diagnosis with highlights from medical applications

B Boashash, G Azemi… - IEEE signal processing …, 2013 - ieeexplore.ieee.org
This article presents a methodical approach for improving quadratic time-frequency
distribution (QTFD) methods by designing adapted time-frequency (TF) kernels for diagnosis …

LipSound2: Self-supervised pre-training for lip-to-speech reconstruction and lip reading

L Qu, C Weber, S Wermter - IEEE transactions on neural …, 2022 - ieeexplore.ieee.org
The aim of this work is to investigate the impact of crossmodal self-supervised pre-training
for speech reconstruction (video-to-audio) by leveraging the natural co-occurrence of audio …

Discriminative segmental cues to vowel height and consonantal place and voicing in whispered speech

LMT Jesus, S Castilho, A Ferreira, MC Costa - Journal of Phonetics, 2023 - Elsevier
Purpose: The acoustic signal attributes of whispered speech potentially carry sufficiently
distinct information to define vowel spaces and to disambiguate consonant place and …

Voice conversion for whispered speech synthesis

M Cotescu, T Drugman, G Huybrechts… - IEEE Signal …, 2019 - ieeexplore.ieee.org
We present an approach to synthesize whisper by applying a handcrafted signal processing
recipe and Voice Conversion (VC) techniques to convert normally phonated speech to …

Alaryngeal speech enhancement based on one-to-many eigenvoice conversion

H Doi, T Toda, K Nakamura… - … ACM transactions on …, 2013 - ieeexplore.ieee.org
In this paper, we present novel speaking-aid systems based on one-to-many eigenvoice
conversion (EVC) to enhance three types of alaryngeal speech: esophageal speech …

A comprehensive vowel space for whispered speech

HR Sharifzadeh, IV McLoughlin, MJ Russell - Journal of Voice, 2012 - Elsevier
Whispered speech is a relatively common form of communication, used primarily to
selectively exclude or include potential listeners from hearing a spoken message. Despite …

Whispered Speech to Neutral Speech Conversion Using Bidirectional LSTMs.

GN Meenakshi, PK Ghosh - Interspeech, 2018 - isca-archive.org
We propose a bidirectional long short-term memory (BLSTM) based whispered speech to
neutral speech conversion system that employs the STRAIGHT speech synthesizer. We use …

Fusion of auditory inspired amplitude modulation spectrum and cepstral features for whispered and normal speech speaker verification

M Sarria-Paja, TH Falk - Computer Speech & Language, 2017 - Elsevier
Whispered speech is a natural speaking style that, despite its reduced perceptibility, still
contains relevant information regarding the intended message (i.e., intelligibility), as well as …

Whispered-to-voiced alaryngeal speech conversion with generative adversarial networks

S Pascual, A Bonafonte, J Serrà… - arXiv preprint arXiv …, 2018 - arxiv.org
Most methods of voice restoration for patients suffering from aphonia either produce
whispered or monotone speech. Apart from intelligibility, this type of speech lacks …