Dual-signal transformation LSTM network for real-time noise suppression
NL Westhausen, BT Meyer - arXiv preprint arXiv:2005.07551, 2020 - arxiv.org
This paper introduces a dual-signal transformation LSTM network (DTLN) for real-time
speech enhancement as part of the Deep Noise Suppression Challenge (DNS-Challenge) …
Low-complexity multiclass encryption by compressed sensing
The idea that compressed sensing may be used to encrypt information from unauthorized
receivers has already been envisioned but never explored in depth since its security may …
SpeechLMScore: Evaluating speech generation using speech language model
While human evaluation is the most reliable metric for evaluating speech generation
systems, it is generally costly and time-consuming. Previous studies on automatic speech …
DeepF0: End-to-end fundamental frequency estimation for music and speech signals
We propose a novel pitch estimation technique called DeepF0, which leverages the
available annotated data to directly learn from the raw audio in a data-driven manner. f0 …
Cross-domain neural pitch and periodicity estimation
Pitch is a foundational aspect of our perception of audio signals. Pitch contours are
commonly used to analyze speech and music signals and as input features for many audio …
Exploiting temporal context in CNN based multisource DOA estimation
A Bohlender, A Spriet, W Tirry… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Supervised learning methods are a powerful tool for direction of arrival (DOA) estimation
because they can cope with adverse conditions where simplified models fail. In this work, we …
DNN no-reference PSTN speech quality prediction
Classic public switched telephone networks (PSTN) are often a black box for VoIP network
providers, as they have no access to performance indicators, such as delay or packet loss …
Instantaneous pitch estimation based on RAPT framework
E Azarov, M Vashkevich… - 2012 Proceedings of the …, 2012 - ieeexplore.ieee.org
The paper presents a pitch estimation technique based on the robust algorithm for pitch
tracking (RAPT) framework. The proposed solution provides estimation of instantaneous …
HarmoF0: Logarithmic scale dilated convolution for pitch estimation
Sounds, especially music, contain various harmonic components scattered in the frequency
dimension. It is difficult for normal convolutional neural networks to observe these …
Performance analysis of several pitch detection algorithms on simulated and real noisy speech data
D Jouvet, Y Laprie - 2017 25th European Signal Processing …, 2017 - ieeexplore.ieee.org
This paper analyses the performance of a large bunch of pitch detection algorithms on clean
and noisy speech data. Two sets of noisy speech data are considered. One corresponds to …