VISinger: Variational inference with adversarial learning for end-to-end singing voice synthesis

Y Zhang, J Cong, H Xue, L Xie… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In this paper, we propose VISinger, a complete end-to-end high-quality singing voice
synthesis (SVS) system that directly generates singing audio from lyrics and musical score …

ByteSing: A Chinese singing voice synthesis system using duration allocated encoder-decoder acoustic models and WaveRNN vocoders

Y Gu, X Yin, Y Rao, Y Wan, B Tang… - … on Chinese Spoken …, 2021 - ieeexplore.ieee.org
This paper presents ByteSing, a Chinese singing voice synthesis (SVS) system based on
duration allocated Tacotron-like acoustic models and WaveRNN neural vocoders. Different …

Sinsy: A deep neural network-based singing voice synthesis system

Y Hono, K Hashimoto, K Oura… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
This paper presents Sinsy, a deep neural network (DNN)-based singing voice synthesis
(SVS) system. In recent years, DNNs have been utilized in statistical parametric SVS …

Muskits: an end-to-end music processing toolkit for singing voice synthesis

J Shi, S Guo, T Qian, N Huo, T Hayashi, Y Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper introduces a new open-source platform named Muskits for end-to-end music
processing, which mainly focuses on end-to-end singing voice synthesis (E2E-SVS). Muskits …

VISinger 2: High-fidelity end-to-end singing voice synthesis enhanced by digital signal processing synthesizer

Y Zhang, H Xue, H Li, L Xie, T Guo, R Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org
End-to-end singing voice synthesis (SVS) model VISinger can achieve better performance
than the typical two-stage model with fewer parameters. However, VISinger has several …

Tohoku Kiritan singing database: A singing database for statistical parametric singing synthesis using Japanese pop songs

I Ogawa, M Morise - Acoustical Science and Technology, 2021 - jstage.jst.go.jp
We have built a singing database that can be used for research purposes. Since recent
songs are protected by copyright law, researchers typically use songs that can be used …

Deep learning approaches in topics of singing information processing

C Gupta, H Li, M Goto - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Singing, the vocal production of musical tones, is one of the most important elements of
music. Addressing the needs of real-world applications, the study of technologies related to …

Singing voice synthesis using differentiable LPC and glottal-flow-inspired wavetables

CY Yu, G Fazekas - arXiv preprint arXiv:2306.17252, 2023 - arxiv.org
This paper introduces GlOttal-flow LPC Filter (GOLF), a novel method for singing voice
synthesis (SVS) that exploits the physical characteristics of the human voice using …

Sequence-to-sequence singing voice synthesis with perceptual entropy loss

J Shi, S Guo, N Huo, Y Zhang… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
The neural network (NN) based singing voice synthesis (SVS) systems require sufficient
data to train well and are prone to over-fitting due to data scarcity. However, we often …

SingAug: Data augmentation for singing voice synthesis with cycle-consistent training strategy

S Guo, J Shi, T Qian, S Watanabe, Q Jin - arXiv preprint arXiv:2203.17001, 2022 - arxiv.org
Deep learning based singing voice synthesis (SVS) systems have been demonstrated to
flexibly generate singing with better qualities, compared to conventional statistical …