Speech emotion recognition with deep convolutional neural networks

D Issa, MF Demirci, A Yazici - Biomedical Signal Processing and Control, 2020 - Elsevier
The speech emotion recognition (or, classification) is one of the most challenging topics in
data science. In this work, we introduce a new architecture, which extracts mel-frequency …

Speech emotion recognition with co-attention based multi-level acoustic information

H Zou, Y Si, C Chen, D Rajan… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Speech Emotion Recognition (SER) aims to help the machine to understand human's
subjective emotion from only audio information. However, extracting and utilizing …

Survey of deep representation learning for speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Traditionally, speech emotion recognition (SER) research has relied on manually
handcrafted acoustic features using feature engineering. However, the design of …

Speech emotion recognition using fusion of three multi-task learning-based classifiers: HSF-DNN, MS-CNN and LLD-RNN

Z Yao, Z Wang, W Liu, Y Liu, J Pan - Speech Communication, 2020 - Elsevier
Speech emotion recognition plays an increasingly important role in emotional computing
and is still a challenging task due to its complexity. In this study, we developed a framework …

Wavoice: A noise-resistant multi-modal speech recognition system fusing mmwave and audio signals

T Liu, M Gao, F Lin, C Wang, Z Ba, J Han… - Proceedings of the 19th …, 2021 - dl.acm.org
With the advance in automatic speech recognition, voice user interface has gained
popularity recently. Since the COVID-19 pandemic, VUI is increasingly preferred in online …

Combining a parallel 2D CNN with a self-attention Dilated Residual Network for CTC-based discrete speech emotion recognition

Z Zhao, Q Li, Z Zhang, N Cummins, H Wang, J Tao… - Neural Networks, 2021 - Elsevier
A challenging issue in the field of the automatic recognition of emotion from speech is the
efficient modelling of long temporal contexts. Moreover, when incorporating long-term …

Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP

N Antoniou, A Katsamanis… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
There is an imminent need for guidelines and standard test sets to allow direct and fair
comparisons of speech emotion recognition (SER). While resources, such as the Interactive …

Head fusion: Improving the accuracy and robustness of speech emotion recognition on the IEMOCAP and RAVDESS dataset

M Xu, F Zhang, W Zhang - IEEE Access, 2021 - ieeexplore.ieee.org
Speech Emotion Recognition (SER) refers to the use of machines to recognize the emotions
of a speaker from his (or her) speech. SER benefits Human-Computer Interaction (HCI). But …

Speech emotion recognition with multiscale area attention and data augmentation

M Xu, F Zhang, X Cui, W Zhang - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
In Speech Emotion Recognition (SER), emotional characteristics often appear in diverse
forms of energy patterns in spectrograms. Typical attention neural network classifiers of SER …

Isnet: Individual standardization network for speech emotion recognition

W Fan, X Xu, B Cai, X Xing - IEEE/ACM Transactions on Audio …, 2022 - ieeexplore.ieee.org
Speech emotion recognition plays an essential role in human-computer interaction.
However, cross-individual representation learning and individual-agnostic systems are …