Google Scholar

Recent advances in end-to-end automatic speech recognition

J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com

Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …

Cited by 448. Versions: 8.

[PDF] ieee.org

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Cited by 181. Versions: 6.

[PDF] mlr.press

Branchformer: Parallel MLP-attention architectures to capture local and global context for speech recognition and understanding

Y Peng, S Dalmia, I Lane… - … Conference on Machine …, 2022 - proceedings.mlr.press

Conformer has proven to be effective in many speech processing tasks. It combines the
benefits of extracting local dependencies using convolutions and global dependencies …

Cited by 179. Versions: 8.

[PDF] neurips.cc

Squeezeformer: An efficient transformer for automatic speech recognition

S Kim, A Gholami, A Shaw, N Lee… - Advances in …, 2022 - proceedings.neurips.cc

The recently proposed Conformer model has become the de facto backbone model for
various downstream speech tasks based on its hybrid attention-convolution architecture that …

Cited by 141. Versions: 7.

[PDF] arxiv.org

WenetSpeech: A 10000+ hours multi-domain Mandarin corpus for speech recognition

B Zhang, H Lv, P Guo, Q Shao, C Yang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

In this paper, we present WenetSpeech, a multi-domain Mandarin corpus consisting of
10000+ hours high-quality labeled speech, 2400+ hours weakly labeled speech, and about …

Cited by 223. Versions: 6.

[PDF] arxiv.org

GigaSpeech: An evolving, multi-domain ASR corpus with 10,000 hours of transcribed audio

G Chen, S Chai, G Wang, J Du, WQ Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org

This paper introduces GigaSpeech, an evolving, multi-domain English speech recognition
corpus with 10,000 hours of high quality labeled audio suitable for supervised training, and …

Cited by 252. Versions: 8.

[PDF] arxiv.org

E-Branchformer: Branchformer with enhanced merging for speech recognition

K Kim, F Wu, Y Peng, J Pan, P Sridhar… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org

Conformer, combining convolution and self-attention sequentially to capture both local and
global information, has shown remarkable performance and is currently regarded as the …

Cited by 117. Versions: 5.

[HTML] sciencedirect.com

Towards inclusive automatic speech recognition

S Feng, BM Halpern, O Kudina… - Computer Speech & …, 2024 - Elsevier

Practice and recent evidence show that state-of-the-art (SotA) automatic speech recognition
(ASR) systems do not perform equally well for all speaker groups. Many factors can cause …

Cited by 65. Versions: 8.

[PDF] arxiv.org

Fast conformer with linearly scalable attention for efficient speech recognition

D Rekesh, NR Koluguri, S Kriman… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

Conformer-based models have become the dominant end-to-end architecture for speech
processing tasks. With the objective of enhancing the conformer architecture for efficient …

Cited by 83. Versions: 3.

[PDF] arxiv.org

The singing voice conversion challenge 2023

WC Huang, LP Violeta, S Liu, J Shi… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

We present the latest iteration of the voice conversion challenge (VCC) series, a bi-annual
scientific event aiming to compare and understand different voice conversion (VC) systems …

Cited by 58. Versions: 4.

Recent developments on espnet toolkit boosted by conformer

Recent advances in end-to-end automatic speech recognition
