Speechformer-ctc: Sequential modeling of depression detection with speech temporal classification
Speech-based automatic depression detection systems have been extensively explored
over the past few years. Typically, each speaker is assigned a single label (Depressive or …
CENN: Capsule-enhanced neural network with innovative metrics for robust speech emotion recognition
H Zhang, H Huang, P Zhao, X Zhu, Z Yu - Knowledge-Based Systems, 2024 - Elsevier
Speech emotion recognition (SER) plays a pivotal role in enhancing Human-computer
interaction (HCI) systems. This paper introduces a groundbreaking Capsule-enhanced …
Speech emotion recognition using the novel SwinEmoNet (Shifted Window Transformer Emotion Network)
R Ramesh, VB Prahaladhan, P Nithish… - International Journal of …, 2024 - Springer
Understanding human emotions is necessary for various tasks, including interpersonal
interaction, knowledge acquisition, and determining courses of action. Recognizing …
Swin-BERT: A Feature Fusion System designed for Speech-based Alzheimer's Dementia Detection
Speech is usually used for constructing an automatic Alzheimer's dementia (AD) detection
system, as the acoustic and linguistic abilities show a decline in people living with AD at the …
Common Discriminative Latent Space Learning for Cross-Domain Speech Emotion Recognition
S Fu, P Song, H Wang, Z Liu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Cross-domain speech emotion recognition (SER) has received increasing attention in recent
years. Existing transfer subspace learning and regression-based SER methods have the …
Discriminative feature learning based on multi-view attention network with diffusion joint loss for speech emotion recognition
Y Liu, X Chen, Y Song, Y Li, S Wang, W Yuan… - … Applications of Artificial …, 2024 - Elsevier
In speech emotion recognition, existing models often struggle to accurately classify emotions
with high similarity. In this paper, we propose a novel architecture that integrates a multi …
DSTM: A transformer-based model with dynamic-static feature fusion in speech emotion recognition
G **, Y Xu, H Kang, J Wang, B Miao - Computer Speech & Language, 2025 - Elsevier
With the support of multi-head attention, the Transformer shows remarkable results in
speech emotion recognition. However, existing models still suffer from the inability to …
DCEPNet: Dual-Channel Emotional Perception Network for Speech Emotion Recognition
F **ang, H Liu, R Wang, J Hou, X Wang - Proceedings of the 6th ACM …, 2024 - dl.acm.org
Speech emotion recognition has been widely used in many applications such as call centres
and mental health monitoring. However, speech emotion recognition still faces great …
Domain mismatch and data augmentation in speech emotion recognition
Large, pretrained model architectures have demonstrated potential in a wide range of audio
recognition and classification tasks. These architectures are increasingly being used in …
Towards Better and Privacy-Preserving Speech Modeling for Depression Detection
J Wang - 2024 - search.proquest.com
Automatic depression detection systems based on speech signals have recently garnered
significant attention. Depression modeling from speech signals, however, faces three …