Deep spoken keyword spotting: An overview
Spoken keyword spotting (KWS) deals with the identification of keywords in audio streams
and has become a fast-growing technology thanks to the paradigm shift introduced by deep …
A comparative study on Transformer vs RNN in speech applications
Sequence-to-sequence models have been widely used in end-to-end speech processing,
for example, automatic speech recognition (ASR), speech translation (ST), and text-to …
CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings
S Watanabe, M Mandel, J Barker, E Vincent… - ar…
Multi-microphone complex spectral mapping for utterance-wise and continuous speech separation
We propose multi-microphone complex spectral mapping, a simple way of applying deep
learning for time-varying non-linear beamforming, for speaker separation in reverberant …
Voices Obscured in Complex Environmental Settings (VOICES) corpus
C Richey, MA Barrios, Z Armstrong, C Bartels… - arXiv preprint arXiv …, 2018 - arxiv.org
This paper introduces the Voices Obscured In Complex Environmental Settings (VOICES)
corpus, a freely available dataset under Creative Commons BY 4.0. This dataset will …