Google Scholar

The CHiME-7 DASR challenge: Distant meeting transcription with multiple devices in diverse scenarios

S Cornell, M Wiesner, S Watanabe, D Raj… - ar** speech captured by a distant microphone array with an arbitrary …

Cited by 15. All 4 versions.

[PDF] arxiv.org

One model to rule them all? Towards end-to-end joint speaker diarization and speech recognition

S Cornell, J Jung, S Watanabe… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

This paper presents a novel framework for joint speaker diarization (SD) and automatic
speech recognition (ASR), named SLIDAR (sliding-window diarization-augmented …

Cited by 19. All 4 versions.

[PDF] arxiv.org

On word error rate definitions and their efficient computation for multi-speaker speech recognition systems

T von Neumann, C Boeddeker… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

We propose a general framework to compute the word error rate (WER) of ASR systems that
process recordings containing multiple speakers at their input and that produce multiple …

Cited by 30. All 4 versions.

[PDF] arxiv.org

Conformer-based target-speaker automatic speech recognition for single-channel audio

Y Zhang, KC Puvvada, V Lavrukhin… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

We propose CONF-TSASR, a non-autoregressive end-to-end time-frequency domain
architecture for single-channel target-speaker automatic speech recognition (TS-ASR). The …

Cited by 20. All 3 versions.

[PDF] arxiv.org

Streaming speaker-attributed ASR with token-level speaker embeddings

N Kanda, J Wu, Y Wu, X ** speakers in real-world
environments like meetings, but it often falls short in isolating speech segments of a single …

Cited by 6. All 2 versions.

[PDF] arxiv.org

A sidecar separator can convert a single-talker speech recognition system to a multi-talker one

L Meng, J Kang, M Cui, Y Wang, X Wu… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Although automatic speech recognition (ASR) can perform well in common non-overlapping
environments, sustaining performance in multi-talker overlapping speech recognition …

Cited by 16. All 8 versions.

[PDF] arxiv.org

Empowering Whisper as a joint multi-talker and target-talker speech recognition system

L Meng, J Kang, Y Wang, Z Jin, X Wu, X Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Multi-talker speech recognition and target-talker speech recognition, both involve
transcription in multi-talker contexts, remain significant challenges. However, existing …

Cited by 7. All 7 versions.

Streaming multi-talker ASR with token-level serialized output training
