Google Scholar

Foundation transformers

H Wang, S Ma, S Huang, L Dong, W Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

A big convergence of model architectures across language, vision, speech, and multimodal
is emerging. However, under the same name "Transformers", the above areas use different …

Cited by 34, 2 versions

[PDF] mlr.press

Magneto: A foundation transformer

H Wang, S Ma, S Huang, L Dong… - International …, 2023 - proceedings.mlr.press

A big convergence of model architectures across language, vision, speech, and multimodal
is emerging. However, under the same name "Transformers", the above areas use different …

Cited by 9, 4 versions

[PDF] arxiv.org

BERT meets CTC: New formulation of end-to-end speech recognition with pre-trained masked language model

Y Higuchi, B Yan, S Arora, T Ogawa… - arXiv preprint arXiv …, 2022 - arxiv.org

This paper presents BERT-CTC, a novel formulation of end-to-end speech recognition that
adapts BERT for connectionist temporal classification (CTC). Our formulation relaxes the …

Cited by 27, 6 versions

[PDF] sciltp.com

CCE-Net: Causal Convolution Embedding Network for Streaming Automatic Speech Recognition

F Deng, Y Ming, B Lyu - International Journal of Network Dynamics and …, 2024 - sciltp.com

Streaming Automatic Speech Recognition (ASR) has gained significant attention across
various application scenarios, including video conferencing, live sports events, and …

Cited by 2

[PDF] ieee.org

Streaming end-to-end target-speaker automatic speech recognition and activity detection

T Moriya, H Sato, T Ochiai, M Delcroix… - IEEE Access, 2023 - ieeexplore.ieee.org

Automatic speech recognition of a target speaker in the presence of interfering speakers
remains a challenging issue. One approach to tackle this problem is target-speaker speech …

Cited by 13, 3 versions, Web of Science: 4

[PDF] arxiv.org

BECTRA: Transducer-based end-to-end ASR with BERT-enhanced encoder

Y Higuchi, T Ogawa, T Kobayashi… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

We present BERT-CTC-Transducer (BECTRA), a novel end-to-end automatic speech
recognition (E2E-ASR) model formulated by the transducer with a BERT-enhanced encoder …

Cited by 15, 6 versions

[PDF] arxiv.org

Memory-efficient training of RNN-Transducer with sampled softmax

J Lee, L Lee, S Watanabe - arXiv preprint arXiv:2203.16868, 2022 - arxiv.org

RNN-Transducer has been one of promising architectures for end-to-end automatic speech
recognition. Although RNN-Transducer has many advantages including its strong accuracy …

Cited by 12, 7 versions

[HTML] sciencedirect.com

Decoupled structure for improved adaptability of end-to-end models

K Deng, PC Woodland - Speech Communication, 2024 - Elsevier

Although end-to-end (E2E) trainable automatic speech recognition (ASR) has shown great
success by jointly learning acoustic and linguistic information, it still suffers from the effect of …

Cited by 3, 5 versions

[PDF] isca-archive.org

miniStreamer: Enhancing small conformer with chunked-context masking for streaming ASR applications on the edge

H Gulzar, MR Busto, T Eda, K Itoyama, K Nakadai - Interspeech, 2023 - isca-archive.org

Real-time applications of Automatic Speech Recognition (ASR) on user devices on the edge
require streaming processing. Conformer model has achieved state-of-the-art performance …

Cited by 4, 3 versions

[PDF] hal.science

Transformer model compression for end-to-end speech recognition on mobile devices

LB Letaifa, JL Rouas - 2022 30th European Signal Processing …, 2022 - ieeexplore.ieee.org

Transformer-based models have achieved state-of-the-art performance in various areas of
machine learning, including automatic speech recognition. However, their cost in terms of …

Cited by 8, 6 versions

A study of transducer based end-to-end ASR with ESPnet: Architecture, auxiliary loss and...

Foundation transformers
