Deep representation learning in speech processing: Challenges, recent advances, and future trends

S Latif, R Rana, S Khalifa, R Jurdak, J Qadir… - arXiv preprint arXiv …, 2020 - arxiv.org
Research on speech processing has traditionally considered the task of designing hand-
engineered acoustic features (feature engineering) as a separate distinct problem from the …

Survey of deep representation learning for speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Traditionally, speech emotion recognition (SER) research has relied on manually
handcrafted acoustic features using feature engineering. However, the design of …

Towards learning a universal non-semantic representation of speech

J Shor, A Jansen, R Maor, O Lang, O Tuval… - arXiv preprint arXiv …, 2020 - arxiv.org
The ultimate goal of transfer learning is to reduce labeled data requirements by exploiting a
pre-existing embedding model trained for different datasets or tasks. The visual and …

Domain adversarial for acoustic emotion recognition

M Abdelwahab, C Busso - IEEE/ACM Transactions on Audio …, 2018 - ieeexplore.ieee.org
The performance of speech emotion recognition is affected by the differences in data
distributions between train (source domain) and test (target domain) sets used to build and …

SMIN: Semi-supervised multi-modal interaction network for conversational emotion recognition

Z Lian, B Liu, J Tao - IEEE Transactions on Affective Computing, 2022 - ieeexplore.ieee.org
Conversational emotion recognition is a crucial research topic in human-computer
interactions. Due to the heavy annotation cost and inevitable label ambiguity, collecting …

Improving speech emotion recognition with unsupervised representation learning on unlabeled speech

M Neumann, NT Vu - ICASSP 2019-2019 IEEE International …, 2019 - ieeexplore.ieee.org
In this paper we present our findings on how representation learning on large unlabeled
speech corpora can be beneficially utilized for speech emotion recognition (SER). Prior work …

On the evolution of speech representations for affective computing: A brief history and critical overview

S Alisamir, F Ringeval - IEEE Signal Processing Magazine, 2021 - ieeexplore.ieee.org
Recent advances in the field of machine learning have shown great potential for the
automatic recognition of apparent human emotions. In the era of Internet of Things and big …

Multi-task semi-supervised adversarial autoencoding for speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions …, 2020 - ieeexplore.ieee.org
Despite the emerging importance of Speech Emotion Recognition (SER), the state-of-the-art
accuracy is quite low and needs improvement to make commercial applications of SER …

Semi-supervised speech emotion recognition with ladder networks

S Parthasarathy, C Busso - IEEE/ACM Transactions on Audio …, 2020 - ieeexplore.ieee.org
Speech emotion recognition (SER) systems find applications in various fields such as
healthcare, education, and security and defense. A major drawback of these systems is their …

End-to-end audiovisual speech recognition system with multitask learning

F Tao, C Busso - IEEE Transactions on Multimedia, 2020 - ieeexplore.ieee.org
An automatic speech recognition (ASR) system is a key component in current speech-based
systems. However, the surrounding acoustic noise can severely degrade the performance of …