- Academic Search

Speech recognition using deep neural networks: A systematic review

AB Nassif, I Shahin, I Attili, M Azzeh, K Shaalan - IEEE Access, 2019 - ieeexplore.ieee.org

Over the past decades, a tremendous amount of research has been done on the use of
machine learning for speech processing applications, especially speech recognition …

Cited by 1359

A review of speaker diarization: Recent advances with deep learning

TJ Park, N Kanda, D Dimitriadis, KJ Han… - Computer Speech & …, 2022 - Elsevier

Speaker diarization is a task to label audio or video recordings with classes that correspond
to speaker identity, or in short, a task to identify “who spoke when”. In the early years …

Cited by 417

A survey on voice assistant security: Attacks and countermeasures

C Yan, X Ji, K Wang, Q Jiang, Z Jin, W Xu - ACM Computing Surveys, 2022 - dl.acm.org

Voice assistants (VA) have become prevalent on a wide range of personal devices such as
smartphones and smart speakers. As companies build voice assistants with extra …

Cited by 64

Large scale deep neural network acoustic modeling with semi-supervised training data for YouTube video transcription

H Liao, E McDermott, A Senior - 2013 IEEE Workshop on …, 2013 - ieeexplore.ieee.org

YouTube is a highly visited video sharing website where over one billion people watch six
billion hours of video every month. Improving accessibility to these videos for the hearing …

Cited by 254

Voice activity detection using an adaptive context attention model

J Kim, M Hahn - IEEE Signal Processing Letters, 2018 - ieeexplore.ieee.org

Voice activity detection (VAD) classifies incoming signal segments into speech or
background noise; its performance is crucial in various speech-related applications …

Cited by 120

Voice activity detection in the wild: A data-driven approach using teacher-student training

H Dinkel, S Wang, X Xu, M Wu… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org

Voice activity detection is an essential pre-processing component for speech-related tasks
such as automatic speech recognition (ASR). Traditional supervised VAD systems obtain …

Cited by 55

Optimization of RNN-based speech activity detection

G Gelly, JL Gauvain - IEEE/ACM Transactions on Audio …, 2017 - ieeexplore.ieee.org

Speech activity detection (SAD) is an essential component of automatic speech recognition
systems impacting the overall system performance. This paper investigates an optimization …

Cited by 122

An open-source voice type classifier for child-centered daylong recordings

M Lavechin, R Bousbib, H Bredin, E Dupoux… - arXiv preprint arXiv …, 2020 - arxiv.org

Spontaneous conversations in real-world settings such as those found in child-centered
recordings have been shown to be amongst the most challenging audio files to process …

Cited by 60

End-to-end automatic speech recognition integrated with CTC-based voice activity detection

T Yoshimura, T Hayashi, K Takeda… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

This paper integrates a voice activity detection (VAD) function with end-to-end automatic
speech recognition toward an online speech interface and transcribing very long audio …

Cited by 55

A hybrid CNN-BiLSTM voice activity detector

N Wilkinson, T Niesler - ICASSP 2021-2021 IEEE International …, 2021 - ieeexplore.ieee.org

This paper presents a new hybrid architecture for voice activity detection (VAD)
incorporating both convolutional neural network (CNN) and bidirectional long short-term …

Cited by 42

Speech activity detection on youtube using deep neural networks.

Speech recognition using deep neural networks: A systematic review
