- Academic Search

Fast timing-conditioned latent audio diffusion

Z Evans, CJ Carr, J Taylor, SH Hawley… - Forty-first International …, 2024 - openreview.net

Generating long-form 44.1 kHz stereo audio from text prompts can be computationally
demanding. Further, most previous works do not tackle that music and sound effects …

Cited by 90


[PDF] arxiv.org

Speech enhancement and dereverberation with diffusion-based generative models

J Richter, S Welker, JM Lemercier… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In this work, we build upon our previous publication and use diffusion-based generative
models for speech enhancement. We present a detailed overview of the diffusion process …

Cited by 183


[HTML] sciencedirect.com

Hybrid flexible (HyFlex) seminar delivery–A technical overview of the implementation

R Sanchez-Pizani, M Detyna, S Dance… - Building and …, 2022 - Elsevier

This paper investigates a new technology for Hybrid flexible delivery (known as HyFlex), as
implemented at King's College London. The relatively novel character of HyFlex, of mixing …

Cited by 29


[PDF] arxiv.org

Adversarial attacks against automatic speech recognition systems via psychoacoustic hiding

L Schönherr, K Kohls, S Zeiler, T Holz… - arXiv preprint arXiv …, 2018 - arxiv.org

Voice interfaces are becoming accepted widely as input methods for a diverse set of
devices. This development is driven by rapid improvements in automatic speech recognition …

Cited by 370


[PDF] arxiv.org

Long-form music generation with latent diffusion

Z Evans, JD Parker, CJ Carr, Z Zukowski… - arXiv preprint arXiv …, 2024 - arxiv.org

Audio-based generative models for music have seen great strides recently, but so far have
not managed to produce full-length music tracks with coherent musical structure from text …

Cited by 37


[HTML] mdpi.com

A review of neural network-based emulation of guitar amplifiers

T Vanhatalo, P Legrand, M Desainte-Catherine… - Applied Sciences, 2022 - mdpi.com

Vacuum tube amplifiers present sonic characteristics frequently coveted by musicians, that
are often due to the distinct nonlinearities of their circuits, and accurately modelling such …

Cited by 21


[PDF] arxiv.org

Insights into deep non-linear filters for improved multi-channel speech enhancement

K Tesch, T Gerkmann - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org

The key advantage of using multiple microphones for speech enhancement is that spatial
filtering can be used to complement the tempo-spectral processing. In a traditional setting …

Cited by 59


[PDF] arxiv.org

ESPnet2-TTS: Extending the edge of TTS research

T Hayashi, R Yamamoto, T Yoshimura, P Wu… - arXiv preprint arXiv …, 2021 - arxiv.org

This paper describes ESPnet2-TTS, an end-to-end text-to-speech (E2E-TTS) toolkit.
ESPnet2-TTS extends our earlier version, ESPnet-TTS, by adding many new features …

Cited by 71


[PDF] arxiv.org

An open source implementation of ITU-T Recommendation P.808 with validation

B Naderi, R Cutler - arXiv preprint arXiv:2005.08138, 2020 - arxiv.org

The ITU-T Recommendation P.808 provides a crowdsourcing approach for conducting a
subjective assessment of speech quality using the Absolute Category Rating (ACR) method …

Cited by 93


[PDF] arxiv.org

Differentiable artificial reverberation

S Lee, HS Choi, K Lee - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org

Artificial reverberation (AR) models play a central role in various audio applications.
Therefore, estimating the AR model parameters (ARPs) of a reference reverberation is a …

Cited by 53
