- Academic Search

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Simpan Kutip Dirujuk 232 kali Artikel terkait 6 versi

[Free GPT-4]
[DeepSeek]

[PDF] nowpublishers.com

[PDF][PDF] Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Simpan Kutip Dirujuk 440 kali Artikel terkait 7 versi Versi HTML

[Free GPT-4]
[DeepSeek]

[PDF] thecvf.com

Images speak in images: A generalist painter for in-context visual learning

X Wang, W Wang, Y Cao, C Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com

In-context learning, as a new paradigm in NLP, allows the model to rapidly adapt to various
tasks with only a handful of prompts and examples. But in computer vision, the difficulties for …

Simpan Kutip Dirujuk 241 kali Artikel terkait 5 versi Versi HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

SpeechBrain: A general-purpose speech toolkit

M Ravanelli, T Parcollet, P Plantinga, A Rouhe… - arxiv preprint arxiv …, 2021 - arxiv.org

SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the
research and development of neural speech processing technologies by being simple …

Simpan Kutip Dirujuk 747 kali Artikel terkait 5 versi Versi HTML

[Free GPT-4]
[DeepSeek]

[PDF] thecvf.com

Going deeper with image transformers

H Touvron, M Cord, A Sablayrolles… - Proceedings of the …, 2021 - openaccess.thecvf.com

Transformers have been recently adapted for large scale image classification, achieving
high scores shaking up the long supremacy of convolutional neural networks. However the …

Simpan Kutip Dirujuk 1217 kali Artikel terkait 5 versi Versi HTML

[Free GPT-4]
[DeepSeek]

[PDF] mlr.press

Branchformer: Parallel mlp-attention architectures to capture local and global context for speech recognition and understanding

Y Peng, S Dalmia, I Lane… - … Conference on Machine …, 2022 - proceedings.mlr.press

Conformer has proven to be effective in many speech processing tasks. It combines the
benefits of extracting local dependencies using convolutions and global dependencies …

Simpan Kutip Dirujuk 174 kali Artikel terkait 8 versi Versi HTML

[Free GPT-4]
[DeepSeek]

[PDF] thecvf.com

Ai choreographer: Music conditioned 3d dance generation with aist++

R Li, S Yang, DA Ross… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

We present AIST++, a new multi-modal dataset of 3D dance motion and music, along with
FACT, a Full-Attention Cross-modal Transformer network for generating 3D dance motion …

Simpan Kutip Dirujuk 505 kali Artikel terkait 6 versi Versi HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Attention is all you need in speech separation

C Subakan, M Ravanelli, S Cornell… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

Recurrent Neural Networks (RNNs) have long been the dominant architecture in sequence-
to-sequence learning. RNNs, however, are inherently sequential models that do not allow …

Simpan Kutip Dirujuk 671 kali Artikel terkait 7 versi

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

Conformer: Convolution-augmented transformer for speech recognition

A Gulati, J Qin, CC Chiu, N Parmar, Y Zhang… - arxiv preprint arxiv …, 2020 - arxiv.org

Recently Transformer and Convolution neural network (CNN) based models have shown
promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural …

Simpan Kutip Dirujuk 3547 kali Artikel terkait 12 versi Versi HTML

[Free GPT-4]
[DeepSeek]

[PDF] ieee.org

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Simpan Kutip Dirujuk 176 kali Artikel terkait 6 versi

Buat notifikasi

Kutip

Penelusuran lanjutan

Disimpan ke Koleksi saya

A comparative study on transformer vs rnn in speech applications

A review of deep learning techniques for speech processing

[PDF][PDF] Recent advances in end-to-end automatic speech recognition

Images speak in images: A generalist painter for in-context visual learning

SpeechBrain: A general-purpose speech toolkit

Going deeper with image transformers

Branchformer: Parallel mlp-attention architectures to capture local and global context for speech recognition and understanding

Ai choreographer: Music conditioned 3d dance generation with aist++

Attention is all you need in speech separation

Conformer: Convolution-augmented transformer for speech recognition

End-to-end speech recognition: A survey