Google Academic

Crowdmos: An approach for crowdsourcing mean opinion score studies

Turnitin 降AI改写早检测系统早降重系统 Turnitin-UK版万方检测-期刊版维普编辑部版 Grammarly检测 Paperpass检测 checkpass检测 PaperYY检测

A survey on deep learning for symbolic music generation: Representations, algorithms, evaluations, and challenges

S Ji, X Yang, J Luo - ACM Computing Surveys, 2023 - dl.acm.org

Significant progress has been made in symbolic music generation with the help of deep
learning techniques. However, the tasks covered by symbolic music generation have not …

Salvați Citați Citat de 79 ori Articole cu conținut similar Toate cele 2 versiuni

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

A comprehensive survey on deep music generation: Multi-level representations, algorithms, evaluations, and future directions

S Ji, J Luo, X Yang - arxiv preprint arxiv:2011.06801, 2020 - arxiv.org

The utilization of deep learning techniques in generating various contents (such as image,
text, etc.) has become a trend. Especially music, the topic of this paper, has attracted …

Salvați Citați Citat de 195 ori Articole cu conținut similar Toate cele 3 versiuni Afișare ca HTML

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Simple and controllable music generation

J Copet, F Kreuk, I Gat, T Remez… - Advances in …, 2023 - proceedings.neurips.cc

We tackle the task of conditional music generation. We introduce MusicGen, a single
Language Model (LM) that operates over several streams of compressed discrete music …

Salvați Citați Citat de 483 ori Articole cu conținut similar Toate cele 8 versiuni Afișare ca HTML

[Free GPT-4]
[DeepSeek]

[PDF] jmlr.org

Scaling speech technology to 1,000+ languages

V Pratap, A Tjandra, B Shi, P Tomasello, A Babu… - Journal of Machine …, 2024 - jmlr.org

Expanding the language coverage of speech technology has the potential to improve
access to information for many more people. However, current speech technology is …

Salvați Citați Citat de 295 ori Articole cu conținut similar Toate cele 4 versiuni Afișare ca HTML

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Voicebox: Text-guided multilingual universal speech generation at scale

M Le, A Vyas, B Shi, B Karrer, L Sari… - Advances in neural …, 2023 - proceedings.neurips.cc

Large-scale generative models such as GPT and DALL-E have revolutionized the research
community. These models not only generate high fidelity outputs, but are also generalists …

Salvați Citați Citat de 268 ori Articole cu conținut similar Toate cele 9 versiuni Afișare ca HTML

[Free GPT-4]
[DeepSeek]

[PDF] mlr.press

Yourtts: Towards zero-shot multi-speaker tts and zero-shot voice conversion for everyone

E Casanova, J Weber, CD Shulby… - International …, 2022 - proceedings.mlr.press

YourTTS brings the power of a multilingual approach to the task of zero-shot multi-speaker
TTS. Our method builds upon the VITS model and adds several novel modifications for zero …

Salvați Citați Citat de 453 ori Articole cu conținut similar Toate cele 7 versiuni Afișare ca HTML

[Free GPT-4]
[DeepSeek]

[PDF] ieee.org Full View

Icassp 2023 deep noise suppression challenge

H Dubey, A Aazami, V Gopal, B Naderi… - IEEE Open Journal …, 2024 - ieeexplore.ieee.org

The ICASSP 2023 Deep Noise Suppression (DNS) Challenge marks the fifth edition of the
DNS challenge series. DNS challenges were organized from 2019 to 2023 to foster …

Salvați Citați Citat de 250 ori Articole cu conținut similar Toate cele 13 versiuni

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Diffwave: A versatile diffusion model for audio synthesis

Z Kong, W **, J Huang, K Zhao… - arxiv preprint arxiv …, 2020 - arxiv.org

In this work, we propose DiffWave, a versatile diffusion probabilistic model for conditional
and unconditional waveform generation. The model is non-autoregressive, and converts the …

Salvați Citați Citat de 1490 ori Articole cu conținut similar Toate cele 4 versiuni Afișare ca HTML

[Free GPT-4]
[DeepSeek]

[PDF] neurips.cc

Textually pretrained speech language models

M Hassid, T Remez, TA Nguyen, I Gat… - Advances in …, 2023 - proceedings.neurips.cc

Speech language models (SpeechLMs) process and generate acoustic data only, without
textual supervision. In this work, we propose TWIST, a method for training SpeechLMs using …

Salvați Citați Citat de 62 ori Articole cu conținut similar Toate cele 5 versiuni Afișare ca HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Real time speech enhancement in the waveform domain

A Defossez, G Synnaeve, Y Adi - arxiv preprint arxiv:2006.12847, 2020 - arxiv.org

We present a causal speech enhancement model working on the raw waveform that runs in
real-time on a laptop CPU. The proposed model is based on an encoder-decoder …

Salvați Citați Citat de 580 ori Articole cu conținut similar Toate cele 7 versiuni Afișare ca HTML

Creează alerta

Citați

Căutare avansată

Salvat în Bibliotecă

Crowdmos: An approach for crowdsourcing mean opinion score studies

A survey on deep learning for symbolic music generation: Representations, algorithms, evaluations, and challenges

A comprehensive survey on deep music generation: Multi-level representations, algorithms, evaluations, and future directions

Simple and controllable music generation

Scaling speech technology to 1,000+ languages

Voicebox: Text-guided multilingual universal speech generation at scale

Yourtts: Towards zero-shot multi-speaker tts and zero-shot voice conversion for everyone

Icassp 2023 deep noise suppression challenge

Diffwave: A versatile diffusion model for audio synthesis

Textually pretrained speech language models

Real time speech enhancement in the waveform domain