An overview of voice conversion and its challenges: From statistical modeling to deep learning
Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …
Expressive TTS training with frame and style reconstruction loss
We propose a novel training strategy for a Tacotron-based text-to-speech (TTS) system that
improves the speech styling at utterance level. One of the key challenges in prosody …
Deep learning approaches in topics of singing information processing
Singing, the vocal production of musical tones, is one of the most important elements of
music. Addressing the needs of real-world applications, the study of technologies related to …
Elucidate gender fairness in singing voice transcription
It is widely known that males and females typically possess different sound characteristics
when singing, such as timbre and pitch, but it has never been explored whether these …
SLIONS: A karaoke application to enhance foreign language learning
Singing songs can be an engaging and effective activity when learning a foreign language.
In this paper, we describe a multi-language karaoke application called SLIONS: Singing and …
Analysis and modeling of timbre perception features in musical sounds
W Jiang, J Liu, X Zhang, S Wang, Y Jiang - Applied Sciences, 2020 - mdpi.com
A novel technique is proposed for the analysis and modeling of timbre perception features,
including a new terminology system for evaluating timbre in musical instruments. This …
Automatic Pronunciation Evaluation of Singing.
In this work, we develop a strategy to automatically evaluate pronunciation of singing. We
apply singing-adapted automatic speech recognizer (ASR) in a two-stage approach for …
Perception-aware attack: Creating adversarial music via reverse-engineering human perception
Previous adversarial audio attacks have mainly focused on ensuring the effectiveness of
attacking an audio signal classifier via creating a small noise-like perturbation on the …
Wavelet Analysis of Speaker Dependent and Independent Prosody for Voice Conversion.
Thus far, voice conversion studies are mainly focused on the conversion of spectrum.
However, speaker identity is also characterized by its prosody features, such as fundamental …
Speech-to-singing voice conversion: The challenges and strategies for improving vocal conversion processes
Speech-to-singing (STS) conversion is the task of converting the read lyrics of a song,
spoken in natural manner, to proper singing. The most important aspect of the task is to …