A comprehensive survey and analysis of generative models in machine learning

GM Harshvardhan, MK Gourisaria, M Pandey… - Computer Science …, 2020 - Elsevier
Generative models have been in existence for many decades. In the field of machine
learning, we come across many scenarios when directly learning a target is intractable …

A survey of audio classification using deep learning

K Zaman, M Sah, C Direkoglu, M Unoki - IEEE Access, 2023 - ieeexplore.ieee.org
Deep learning can be used for audio signal classification in a variety of ways. It can be used
to detect and classify various types of audio signals such as speech, music, and …

A review on speech emotion recognition using deep learning and attention mechanism

E Lieskovská, M Jakubec, R Jarina, M Chmulík - Electronics, 2021 - mdpi.com
Emotions are an integral part of human interactions and are significant factors in determining
user satisfaction or customer opinion. Speech emotion recognition (SER) modules also play …

Survey of deep representation learning for speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Traditionally, speech emotion recognition (SER) research has relied on manually
handcrafted acoustic features using feature engineering. However, the design of …

Speech technology for healthcare: Opportunities, challenges, and state of the art

S Latif, J Qadir, A Qayyum, M Usama… - IEEE Reviews in …, 2020 - ieeexplore.ieee.org
Speech technology is not appropriately explored even though modern advances in speech
technology—especially those driven by deep learning (DL) technology—offer …

Towards learning a universal non-semantic representation of speech

J Shor, A Jansen, R Maor, O Lang, O Tuval… - arXiv preprint arXiv …, 2020 - arxiv.org
The ultimate goal of transfer learning is to reduce labeled data requirements by exploiting a
pre-existing embedding model trained for different datasets or tasks. The visual and …

Att-Net: Enhanced emotion recognition system using lightweight self-attention module

S Kwon - Applied Soft Computing, 2021 - Elsevier
Speech emotion recognition (SER) is an active research field of digital signal processing
and plays a crucial role in numerous applications of human–computer interaction (HCI) …

Jointly fine-tuning "BERT-like" self supervised models to improve multimodal speech emotion recognition

S Siriwardhana, A Reis, R Weerasekera… - arXiv preprint arXiv …, 2020 - arxiv.org
Multimodal emotion recognition from speech is an important area in affective computing.
Fusing multiple data modalities and learning representations with limited amounts of labeled …