- Academic Search

Acoustics of children's speech: Developmental changes of temporal and spectral parameters

S Lee, A Potamianos, S Narayanan - The Journal of the Acoustical …, 1999 - pubs.aip.org

Changes in magnitude and variability of duration, fundamental frequency, formant
frequencies, and spectral envelope of children's speech are investigated as a function of …

Cited by 1192


Fine-grained robust prosody transfer for single-speaker neural text-to-speech

V Klimkov, S Ronanki, J Rohnke, T Drugman - arXiv preprint arXiv …, 2019 - arxiv.org

We present a neural text-to-speech system for fine-grained prosody transfer from one
speaker to another. Conventional approaches for end-to-end prosody transfer typically use …

Cited by 101

The Boston University radio news corpus

M Ostendorf, PJ Price… - Linguistic Data …, 1995 - catalog.ldc.upenn.edu

We describe a corpus of professionally read radio news data, including speech and
accompanying annotations, suitable for speech and language research. The corpus consists …

Cited by 398


System and method for time aligning speech

BJ Wheatley, CT Hemphill, TD Fisher… - US Patent …, 1994 - Google Patents

A method and system are provided for time aligning speech.
Speech data is input representing speech signals from a speaker. An orthographic …

Cited by 329


Effectiveness of mining audio and text pairs from public data for improving ASR systems for low-resource languages

K Bhogale, A Raman, T Javed… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Collecting labelled datasets for speech recognition systems for low-resource languages on
a diverse set of domains and speakers is expensive. In this work, we demonstrate an …

Cited by 29


Automatic segmentation and labeling of speech based on Hidden Markov Models

F Brugnara, D Falavigna, M Omologo - Speech Communication, 1993 - Elsevier

An accurate database documentation at phonetic level is very important for speech
research: however, manual segmentation and labeling is a time consuming and error prone …

Cited by 291

SailAlign: Robust long speech-text alignment

A Katsamanis, M Black, PG Georgiou… - Proc. of workshop on …, 2011 - sail.usc.edu

Long speech-text alignment can facilitate large-scale study of rich spoken language
resources that have recently become widely accessible, e.g., collections of audio books, or …

Cited by 160

Trainable speech synthesis

RE Donovan - 1996 - Citeseer

This thesis is concerned with the synthesis of speech using trainable systems. The research
it describes was conducted with two principal aims: to build a hidden Markov model (HMM) …

Cited by 218


Speaker-independent phoneme alignment using transition-dependent states

JP Hosom - Speech Communication, 2009 - Elsevier

Determining the location of phonemes is important to a number of speech applications,
including training of automatic speech recognition systems, building text-to-speech systems …

Cited by 99


An experimental study of speaker verification sensitivity to computer voice-altered imposters

BL Pellom, JHL Hansen - 1999 IEEE International Conference …, 1999 - ieeexplore.ieee.org

This paper investigates the relative sensitivity of a Gaussian mixture model (GMM) based
voice verification algorithm to computer voice-altered imposters. First, a new trainable …

Cited by 102
