Conventional and contemporary approaches used in text to speech synthesis: A review

N Kaur, P Singh - Artificial Intelligence Review, 2023 - Springer

Nowadays speech synthesis or text to speech (TTS), an ability of system to produce human
like natural sounding voice from the written text, is gaining popularity in the field of speech …

Cited by 60

[PDF] academia.edu

Deep learning for acoustic modeling in parametric speech generation: A systematic review of existing techniques and future trends

ZH Ling, SY Kang, H Zen, A Senior… - IEEE Signal …, 2015 - ieeexplore.ieee.org

Hidden Markov models (HMMs) and Gaussian mixture models (GMMs) are the two most
common types of acoustic models used in statistical parametric approaches for generating …

Cited by 310

[PDF] neurips.cc

Deep voice 2: Multi-speaker neural text-to-speech

A Gibiansky, S Arik, G Diamos, J Miller… - Advances in neural …, 2017 - proceedings.neurips.cc

We introduce a technique for augmenting neural text-to-speech (TTS) with low-dimensional
trainable speaker embeddings to generate different voices from a single model. As a starting …

Cited by 448

[PDF] researchgate.net

Statistical parametric speech synthesis using deep neural networks

H Zen, A Senior, M Schuster - 2013 ieee international …, 2013 - ieeexplore.ieee.org

Conventional approaches to statistical parametric speech synthesis typically use decision
tree-clustered context-dependent hidden Markov models (HMMs) to represent probability …

Cited by 1181

[PDF] thecvf.com

Visual to sound: Generating natural sound for videos in the wild

Y Zhou, Z Wang, C Fang, T Bui… - Proceedings of the …, 2018 - openaccess.thecvf.com

As two of the five traditional human senses (sight, hearing, taste, smell, and touch), vision
and sound are basic sources through which humans understand the world. Often correlated …

Cited by 249

[PDF] arxiv.org

Deep voice 2: Multi-speaker neural text-to-speech

S Arik, G Diamos, A Gibiansky, J Miller, K Peng… - arxiv preprint arxiv …, 2017 - arxiv.org

We introduce a technique for augmenting neural text-to-speech (TTS) with low-dimensional
trainable speaker embeddings to generate different voices from a single model. As a starting …

Cited by 228

[PDF] googleapis.com

Method and system for non-parametric voice conversion

I Agiomyrgiannakis - US Patent 9,183,830, 2015 - Google Patents

A method and system is disclosed for non-parametric speech conversion. A text-to-speech (TTS) synthesis system may …

Cited by 263

[PDF] isca-archive.org

WaveNet Vocoder with Limited Training Data for Voice Conversion.

LJ Liu, ZH Ling, Y Jiang, M Zhou, LR Dai - Interspeech, 2018 - isca-archive.org

This paper investigates the approaches of building WaveNet vocoders with limited training
data for voice conversion (VC). Current VC systems using statistical acoustic models always …

Cited by 156

[PDF] ed.ac.uk

Evaluation of speaker verification security and detection of HMM-based synthetic speech

PL De Leon, M Pucher, J Yamagishi… - … on Audio, Speech …, 2012 - ieeexplore.ieee.org

In this paper, we evaluate the vulnerability of speaker verification (SV) systems to synthetic
speech. The SV systems are based on either the Gaussian mixture model–universal …

Cited by 291

[PDF] googleapis.com

Method and system for building text-to-speech voice from diverse recordings

I Agiomyrgiannakis, A Gutkin - US Patent 9,542,927, 2017 - Google Patents

A method and system is disclosed for building a speech database for a
text-to-speech (TTS) synthesis system from multiple speakers recorded under diverse conditions …

Cited by 188
