Enabling resource-efficient AIoT system with cross-level optimization: A survey
The emerging field of artificial intelligence of things (AIoT, AI + IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …
WavLM: Large-scale self-supervised pre-training for full stack speech processing
Self-supervised learning (SSL) achieves great success in speech recognition, while limited
exploration has been attempted for other speech processing tasks. As speech signal …
Continuous speech separation: Dataset and analysis
This paper describes a dataset and protocols for evaluating continuous speech separation
algorithms. Most prior speech separation studies use pre-segmented audio signals, which …
Complex spectral mapping for single- and multi-channel speech enhancement and robust ASR
This study proposes a complex spectral mapping approach for single- and multi-channel
speech enhancement, where deep neural networks (DNNs) are used to predict the real and …
Continuous speech separation with conformer
Continuous speech separation was recently proposed to deal with the overlapped speech in
natural conversations. While it was shown to significantly improve the speech recognition …
Neural spectrospatial filtering
As the most widely-used spatial filtering approach for multi-channel speech separation,
beamforming extracts the target speech signal arriving from a specific direction. An …
Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis
Multi-speaker speech recognition of unsegmented recordings has diverse applications such
as meeting transcription and automatic subtitle generation. With technical advances in …
Combining spectral and spatial features for deep learning based blind speaker separation
This study tightly integrates complementary spectral and spatial features for deep learning
based multi-channel speaker separation in reverberant environments. The key idea is to …
Multi-channel overlapped speech recognition with location guided speech extraction network
Z Chen, X …
Multi-microphone complex spectral mapping for utterance-wise and continuous speech separation
We propose multi-microphone complex spectral mapping, a simple way of applying deep
learning for time-varying non-linear beamforming, for speaker separation in reverberant …