- Academic Search

A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 236

High fidelity neural audio compression

A Défossez, J Copet, G Synnaeve, Y Adi - arXiv preprint arXiv:2210.13438, 2022 - arxiv.org

We introduce a state-of-the-art real-time, high-fidelity, audio codec leveraging neural
networks. It consists in a streaming encoder-decoder architecture with quantized latent …

Cited by 711

Attention is all you need in speech separation

C Subakan, M Ravanelli, S Cornell… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Recurrent Neural Networks (RNNs) have long been the dominant architecture in
sequence-to-sequence learning. RNNs, however, are inherently sequential models that do not allow …

Cited by 673

Real time speech enhancement in the waveform domain

A Défossez, G Synnaeve, Y Adi - arXiv preprint arXiv:2006.12847, 2020 - arxiv.org

We present a causal speech enhancement model working on the raw waveform that runs in
real-time on a laptop CPU. The proposed model is based on an encoder-decoder …

Cited by 568

VisualVoice: Audio-visual speech separation with cross-modal consistency

R Gao, K Grauman - 2021 IEEE/CVF Conference on Computer …, 2021 - ieeexplore.ieee.org

We introduce a new approach for audio-visual speech separation. Given a video, the goal is
to extract the speech associated with a face in spite of simultaneous background sounds …

Cited by 199

Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer

Deep neural networks (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Cited by 27

Wavesplit: End-to-end speech separation by speaker clustering

N Zeghidour, D Grangier - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org

We introduce Wavesplit, an end-to-end source separation system. From a single mixture, the
model infers a representation for each source and then estimates each source signal given …

Cited by 312

Music source separation in the waveform domain

A Défossez, N Usunier, L Bottou, F Bach - arXiv preprint arXiv:1911.13254, 2019 - arxiv.org

Source separation for music is the task of isolating contributions, or stems, from different
instruments recorded individually and arranged together to form a song. Such components …

Cited by 335

Unsupervised sound separation using mixture invariant training

S Wisdom, E Tzinis, H Erdogan… - Advances in neural …, 2020 - proceedings.neurips.cc

In recent years, rapid progress has been made on the problem of single-channel sound
separation using supervised training of deep neural networks. In such supervised …

Cited by 215

TF-GridNet: Integrating full- and sub-band modeling for speech separation

ZQ Wang, S Cornell, S Choi, Y Lee… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

We propose TF-GridNet for speech separation. The model is a novel deep neural network
(DNN) integrating full- and sub-band modeling in the time-frequency (TF) domain. It stacks …

Cited by 116

Voice separation with an unknown number of multiple speakers

A review of deep learning techniques for speech processing
