Dual-signal transformation LSTM network for real-time noise suppression

NL Westhausen, BT Meyer - arXiv preprint arXiv:2005.07551, 2020 - arxiv.org
This paper introduces a dual-signal transformation LSTM network (DTLN) for real-time
speech enhancement as part of the Deep Noise Suppression Challenge (DNS-Challenge) …

Low-complexity multiclass encryption by compressed sensing

V Cambareri, M Mangia, F Pareschi… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
The idea that compressed sensing may be used to encrypt information from unauthorized
receivers has already been envisioned but never explored in depth since its security may …

SpeechLMScore: Evaluating speech generation using speech language model

S Maiti, Y Peng, T Saeki… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
While human evaluation is the most reliable metric for evaluating speech generation
systems, it is generally costly and time-consuming. Previous studies on automatic speech …

DeepF0: End-to-end fundamental frequency estimation for music and speech signals

S Singh, R Wang, Y Qiu - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
We propose a novel pitch estimation technique called DeepF0, which leverages the
available annotated data to directly learn from the raw audio in a data-driven manner. f0 …

Cross-domain neural pitch and periodicity estimation

M Morrison, C Hsieh, N Pruyne, B Pardo - arXiv preprint arXiv:2301.12258, 2023 - arxiv.org
Pitch is a foundational aspect of our perception of audio signals. Pitch contours are
commonly used to analyze speech and music signals and as input features for many audio …

Exploiting temporal context in CNN based multisource DOA estimation

A Bohlender, A Spriet, W Tirry… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Supervised learning methods are a powerful tool for direction of arrival (DOA) estimation
because they can cope with adverse conditions where simplified models fail. In this work, we …

DNN no-reference PSTN speech quality prediction

G Mittag, R Cutler, Y Hosseinkashi, M Revow… - arXiv preprint arXiv …, 2020 - arxiv.org
Classic public switched telephone networks (PSTN) are often a black box for VoIP network
providers, as they have no access to performance indicators, such as delay or packet loss …

Instantaneous pitch estimation based on RAPT framework

E Azarov, M Vashkevich… - 2012 Proceedings of the …, 2012 - ieeexplore.ieee.org
The paper presents a pitch estimation technique based on the robust algorithm for pitch
tracking (RAPT) framework. The proposed solution provides estimation of instantaneous …

HarmoF0: Logarithmic scale dilated convolution for pitch estimation

W Wei, P Li, Y Yu, W Li - 2022 IEEE International Conference …, 2022 - ieeexplore.ieee.org
Sounds, especially music, contain various harmonic components scattered in the frequency
dimension. It is difficult for normal convolutional neural networks to observe these …

Performance analysis of several pitch detection algorithms on simulated and real noisy speech data

D Jouvet, Y Laprie - 2017 25th European Signal Processing …, 2017 - ieeexplore.ieee.org
This paper analyses the performance of a large bunch of pitch detection algorithms on clean
and noisy speech data. Two sets of noisy speech data are considered. One corresponds to …