- Academic Search

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org

Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Cited by 464

Diffsinger: Singing voice synthesis via shallow diffusion mechanism

J Liu, C Li, Y Ren, F Chen, Z Zhao - … of the AAAI conference on artificial …, 2022 - ojs.aaai.org

Singing voice synthesis (SVS) systems are built to synthesize high-quality and expressive
singing voice, in which the acoustic model generates the acoustic features (e.g., mel …

Cited by 282

Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

Cited by 12

M4singer: A multi-style, multi-singer and musical score provided mandarin singing corpus

L Zhang, R Li, S Wang, L Deng, J Liu… - Advances in …, 2022 - proceedings.neurips.cc

The lack of publicly available high-quality and accurately labeled datasets has long been a
major bottleneck for singing voice synthesis (SVS). To tackle this problem, we present …

Cited by 78

A review of differentiable digital signal processing for music and speech synthesis

B Hayes, J Shier, G Fazekas, A McPherson… - Frontiers in Signal …, 2024 - frontiersin.org

The term “differentiable digital signal processing” describes a family of techniques in which
loss function gradients are backpropagated through digital signal processors, facilitating …

Cited by 27

Mega-tts: Zero-shot text-to-speech at scale with intrinsic inductive bias

Z Jiang, Y Ren, Z Ye, J Liu, C Zhang, Q Yang… - arXiv preprint arXiv …, 2023 - arxiv.org

Scaling text-to-speech to a large and wild dataset has been proven to be highly effective in
achieving timbre and speech style generalization, particularly in zero-shot TTS. However …

Cited by 75

The singing voice conversion challenge 2023

WC Huang, LP Violeta, S Liu, J Shi… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

We present the latest iteration of the voice conversion challenge (VCC) series, a bi-annual
scientific event aiming to compare and understand different voice conversion (VC) systems …

Cited by 58

Multi-singer: Fast multi-singer singing voice vocoder with a large-scale corpus

R Huang, F Chen, Y Ren, J Liu, C Cui… - Proceedings of the 29th …, 2021 - dl.acm.org

High-fidelity multi-singer singing voice synthesis is challenging for neural vocoder due to the
singing voice data shortage, limited singer generalization, and large computational cost …

Cited by 105

Opencpop: A high-quality open source chinese popular song corpus for singing voice synthesis

Y Wang, X Wang, P Zhu, J Wu, H Li, H Xue… - arXiv preprint arXiv …, 2022 - arxiv.org

This paper introduces Opencpop, a publicly available high-quality Mandarin singing corpus
designed for singing voice synthesis (SVS). The corpus consists of 100 popular Mandarin …

Cited by 101

Visinger: Variational inference with adversarial learning for end-to-end singing voice synthesis

Y Zhang, J Cong, H Xue, L Xie… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

In this paper, we propose VISinger, a complete end-to-end high-quality singing voice
synthesis (SVS) system that directly generates singing audio from lyrics and musical score …

Cited by 85

Hifisinger: Towards high-fidelity neural singing voice synthesis