- Academic Search

MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis

W Guan, Y Li, T Li, H Huang, F Wang, J Lin… - Proceedings of the …, 2024 - ojs.aaai.org

The style transfer task in Text-to-Speech (TTS) refers to the process of transferring style
information into text content to generate corresponding speech with a specific style …

Robust AI-Synthesized Speech Detection Using Feature Decomposition Learning and Synthesizer Feature Augmentation

K Zhang, Z Hua, Y Zhang, Y Guo… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

AI-synthesized speech, also known as deepfake speech, has recently raised significant
concerns due to the rapid advancement of speech synthesis and speech conversion …

FT-GAN: Fine-Grained Tune Modeling for Chinese Opera Synthesis

M Zheng, P Bai, X Shi, X Zhou, Y Yan - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Although singing voice synthesis (SVS) has made significant progress recently, with its
unique styles and various genres, Chinese opera synthesis requires greater attention but is …

An end-to-end approach for chord-conditioned song generation

S Gao, S Lei, F Zhuo, H Liu, F Liu, B Tang… - arXiv preprint arXiv …, 2024 - arxiv.org

The Song Generation task aims to synthesize music composed of vocals and
accompaniment from given lyrics. While the existing method, Jukebox, has explored this …

Hybrid Learning Module-Based Transformer for Multitrack Music Generation With Music Theory

Y Tie, X Guo, D Zhang, J Tie, L Qi… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

In recent years, multitrack music generation has garnered significant attention in both
academic and industrial spheres for its versatile utilization of various instruments in …

Challenge of Singing Voice Synthesis Using Only Text-To-Speech Corpus With FIRNet Source-Filter Neural Vocoder

T Okamoto, Y Ohtani, S Shimizu, T Toda… - Proc. Interspeech …, 2024 - isca-archive.org

Singing voice synthesis (SVS) corpora are more costly to collect than TTS corpora. SVS
using only a TTS corpus is challenging because the ranges of fundamental frequency (fo) …

LNACont: Language-Normalized Affine Coupling Layer with Contrastive Learning for Cross-Lingual Multi-Speaker Text-to-Speech

S Hwang, C Kim - 2024 32nd European Signal Processing …, 2024 - ieeexplore.ieee.org

The current advancement in text-to-speech (TTS) has achieved a commendable level of
reproducing human-like voices, including diverse speaking style such as multiple speaker …

LHQ-SVC: Lightweight and High Quality Singing Voice Conversion Modeling

Y Huang, X Lai, M Ye, A Zhu, Z Wang, J Xu… - arXiv preprint arXiv …, 2024 - arxiv.org

Singing Voice Conversion (SVC) has emerged as a significant subfield of Voice Conversion
(VC), enabling the transformation of one singer's voice into another while preserving musical …

A Dual-branch Multi-Band Neural Vocoder with Harmonic Discriminator for High-Fidelity Speech Synthesis

N Xu, H Liu - openreview.net

Recent developments in vocoders are primarily dominated by GAN-based networks
targeting to high-quality waveform generation from mel-spectrogram representations …

UniSyn: an end-to-end unified model for text-to-speech and singing voice synthesis

MM-TTS: Multi-Modal Prompt Based Style Transfer for Expressive Text-to-Speech Synthesis
