Google Scholar

Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 452 (8 versions)

Data augmentation techniques in time series domain: a survey and taxonomy

G Iglesias, E Talavera, Á González-Prieto… - Neural Computing and …, 2023 - Springer

With the latest advances in deep learning-based generative models, it has not taken long to
take advantage of their remarkable performance in the area of time series. Deep neural …

Cited by 196 (9 versions)

Voicebox: Text-guided multilingual universal speech generation at scale

M Le, A Vyas, B Shi, B Karrer, L Sari… - Advances in neural …, 2023 - proceedings.neurips.cc

Large-scale generative models such as GPT and DALL-E have revolutionized the research
community. These models not only generate high fidelity outputs, but are also generalists …

Cited by 261 (9 versions)

A high-performance neuroprosthesis for speech decoding and avatar control

SL Metzger, KT Littlejohn, AB Silva, DA Moses… - Nature, 2023 - nature.com

Speech neuroprostheses have the potential to restore communication to people living with
paralysis, but naturalistic speed and expressivity are elusive. Here we use high-density …

Cited by 276 (11 versions)

Qwen-audio: Advancing universal audio understanding via unified large-scale audio-language models

Y Chu, J Xu, X Zhou, Q Yang, S Zhang, Z Yan… - arXiv preprint arXiv …, 2023 - arxiv.org

Recently, instruction-following audio-language models have received broad attention for
audio interaction with humans. However, the absence of pre-trained audio models capable …

Cited by 245 (2 versions)

Robust speech recognition via large-scale weak supervision

A Radford, JW Kim, T Xu, G Brockman… - International …, 2023 - proceedings.mlr.press

We study the capabilities of speech processing systems trained simply to predict large
amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …

Cited by 3990 (11 versions)

Beats: Audio pre-training with acoustic tokenizers

S Chen, Y Wu, C Wang, S Liu, D Tompkins… - arXiv preprint arXiv …, 2022 - arxiv.org

The massive growth of self-supervised learning (SSL) has been witnessed in language,
vision, speech, and audio domains over the past few years. While discrete label prediction is …

Cited by 292 (9 versions)

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Cited by 184 (6 versions)

Masked autoencoders that listen

PY Huang, H Xu, J Li, A Baevski… - Advances in …, 2022 - proceedings.neurips.cc

This paper studies a simple extension of image-based Masked Autoencoders (MAE) to self-
supervised representation learning from audio spectrograms. Following the Transformer …

Cited by 258 (6 versions)

Wavlm: Large-scale self-supervised pre-training for full stack speech processing

S Chen, C Wang, Z Chen, Y Wu, S Liu… - IEEE Journal of …, 2022 - ieeexplore.ieee.org

Self-supervised learning (SSL) achieves great success in speech recognition, while limited
exploration has been attempted for other speech processing tasks. As speech signal …

Cited by 1875 (7 versions)

Specaugment: A simple data augmentation method for automatic speech recognition

Recent advances in end-to-end automatic speech recognition
