MFA-Conformer: Multi-scale feature aggregation conformer for automatic speaker verification
Miipher: A robust speech restoration model integrating self-supervised speech and text representations
Speech restoration (SR) is a task of converting degraded speech signals into high-quality
ones. In this study, we propose a robust SR model called Miipher, and apply Miipher to a …
NSE-CATNet: deep neural speech enhancement using convolutional attention transformer network
Speech enhancement (SE) is a critical aspect of various speech-processing applications.
Recent research in this field focuses on identifying effective ways to capture the long-term …
DeFT-AN: Dense frequency-time attentive network for multichannel speech enhancement
In this study, we propose a dense frequency-time attentive network (DeFT-AN) for
multichannel speech enhancement. DeFT-AN is a mask estimation network that predicts a …
Exploring self-attention mechanisms for speech separation
Transformers have enabled impressive improvements in deep learning. They often
outperform recurrent and convolutional models in many tasks while taking advantage of …