Acoustics of children's speech: Developmental changes of temporal and spectral parameters

S Lee, A Potamianos, S Narayanan - The Journal of the Acoustical …, 1999 - pubs.aip.org
Changes in magnitude and variability of duration, fundamental frequency, formant
frequencies, and spectral envelope of children's speech are investigated as a function of …

Fine-grained robust prosody transfer for single-speaker neural text-to-speech

V Klimkov, S Ronanki, J Rohnke, T Drugman - arXiv preprint arXiv …, 2019 - arxiv.org
We present a neural text-to-speech system for fine-grained prosody transfer from one
speaker to another. Conventional approaches for end-to-end prosody transfer typically use …

The Boston University radio news corpus

M Ostendorf, PJ Price… - Linguistic Data …, 1995 - catalog.ldc.upenn.edu
We describe a corpus of professionally read radio news data, including speech and
accompanying annotations, suitable for speech and language research. The corpus consists …

System and method for time aligning speech

BJ Wheatley, CT Hemphill, TD Fisher… - US Patent …, 1994 - Google Patents
A method and system are provided for time aligning speech.
Speech data is input representing speech signals from a speaker. An orthographic …

Effectiveness of mining audio and text pairs from public data for improving ASR systems for low-resource languages

K Bhogale, A Raman, T Javed… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Collecting labelled datasets for speech recognition systems for low-resource languages on
a diverse set of domains and speakers is expensive. In this work, we demonstrate an …

Automatic segmentation and labeling of speech based on Hidden Markov Models

F Brugnara, D Falavigna, M Omologo - Speech Communication, 1993 - Elsevier
An accurate database documentation at phonetic level is very important for speech
research: however, manual segmentation and labeling is a time consuming and error prone …

SailAlign: Robust long speech-text alignment

A Katsamanis, M Black, PG Georgiou… - Proc. of workshop on …, 2011 - sail.usc.edu
Long speech-text alignment can facilitate large-scale study of rich spoken language
resources that have recently become widely accessible, e.g., collections of audio books, or …

Trainable speech synthesis

RE Donovan - 1996 - Citeseer
This thesis is concerned with the synthesis of speech using trainable systems. The research
it describes was conducted with two principal aims: to build a hidden Markov model (HMM) …

Speaker-independent phoneme alignment using transition-dependent states

JP Hosom - Speech Communication, 2009 - Elsevier
Determining the location of phonemes is important to a number of speech applications,
including training of automatic speech recognition systems, building text-to-speech systems …

An experimental study of speaker verification sensitivity to computer voice-altered imposters

BL Pellom, JHL Hansen - 1999 IEEE International Conference …, 1999 - ieeexplore.ieee.org
This paper investigates the relative sensitivity of a Gaussian mixture model (GMM) based
voice verification algorithm to computer voice-altered imposters. First, a new trainable …