Google Scholar

Naturalspeech 2: Latent diffusion models are natural and zero-shot speech and singing synthesizers

K Shen, Z Ju, X Tan, Y Liu, Y Leng, L He, T Qin… - arxiv preprint arxiv …, 2023 - arxiv.org

Scaling text-to-speech (TTS) to large-scale, multi-speaker, and in-the-wild datasets is
important to capture the diversity in human speech such as speaker identities, prosodies …

Cited by 229 · 4 versions

[PDF] arxiv.org

Hybrid transformers for music source separation

S Rouard, F Massa, A Défossez - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

A natural question arising in Music Source Separation (MSS) is whether long range
contextual information is useful, or whether local acoustic features are sufficient. In other …

Cited by 186 · 3 versions

[PDF] arxiv.org

Music source separation with band-split RNN

Y Luo, J Yu - IEEE/ACM Transactions on Audio, Speech, and …, 2023 - ieeexplore.ieee.org

The performance of music source separation (MSS) models has been greatly improved in
recent years thanks to the development of novel neural network architectures and training …

Cited by 112 · 4 versions

[PDF] frontiersin.org

Music demixing challenge 2021

Y Mitsufuji, G Fabbro, S Uhlich, FR Stöter… - Frontiers in Signal …, 2022 - frontiersin.org

Music source separation has been intensively studied in the last decade and tremendous
progress with the advent of deep learning could be observed. Evaluation campaigns such …

Cited by 97 · 6 versions

[PDF] arxiv.org

Multi-source diffusion models for simultaneous music generation and separation

G Mariani, I Tallini, E Postolache, M Mancusi… - arxiv preprint arxiv …, 2023 - arxiv.org

In this work, we define a diffusion-based generative model capable of both music synthesis
and source separation by learning the score of the joint probability density of sources …

Cited by 40 · 5 versions

[PDF] kyoto-u.ac.jp

Waveform-domain speech enhancement using spectrogram encoding for robust speech recognition

H Shi, M Mimura, T Kawahara - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org

While waveform-domain speech enhancement (SE) has been extensively investigated in
recent years and achieves state-of-the-art performance in many datasets, spectrogram …

Cited by 12 · 4 versions

[PDF] neurips.cc

Songcreator: Lyrics-based universal song generation

S Lei, Y Zhou, B Tang, MWY Lam… - Advances in …, 2025 - proceedings.neurips.cc

Music is an integral part of human culture, embodying human intelligence and creativity, of
which songs compose an essential part. While various aspects of song generation have …

Cited by 3 · 5 versions

[PDF] arxiv.org

Aero: Audio super resolution in the spectral domain

M Mandel, O Tal, Y Adi - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org

We present AERO, an audio super-resolution model that processes speech and music
signals in the spectral domain. AERO is based on an encoder-decoder architecture with …

Cited by 31 · 5 versions

[PDF] arxiv.org

The Sound Demixing Challenge 2023 – Music Demixing Track

G Fabbro, S Uhlich, CH Lai, W Choi… - arxiv preprint arxiv …, 2023 - arxiv.org

This paper summarizes the music demixing (MDX) track of the Sound Demixing Challenge
(SDX'23). We provide a summary of the challenge setup and introduce the task of robust …

Cited by 20 · 8 versions

[PDF] arxiv.org

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch

J Hwang, M Hira, C Chen, X Zhang, Z Ni… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

TorchAudio is an open-source audio and speech processing library built for PyTorch. It aims
to accelerate the research and development of audio and speech technologies by providing …

Cited by 17 · 6 versions


Hybrid spectrogram and waveform source separation

Naturalspeech 2: Latent diffusion models are natural and zero-shot speech and singing synthesizers
