Enhanced human motion detection with hybrid RDA-WOA-based RNN and multiple hypothesis tracking for occlusion handling

JN Cheltha, C Sharma, D Prashar, AA Khan… - Image and Vision …, 2024 - Elsevier
Human motion detection in complex scenarios poses challenges due to occlusions. This
paper presents an integrated approach for accurate human motion detections by combining …

An improved anchor-free object detection method applied in complex scenes based on SDA-DLA34

K Sun, Y Zhen, B Zhang, Z Song - Multimedia Tools and Applications, 2024 - Springer
The anchor-free object detection CenterNet has the problems that the utilization rate of
detected object features is low, which is difficult to detect morphological changes and …

SMTDKD: A Semantic-Aware Multimodal Transformer Fusion Decoupled Knowledge Distillation Method for Action Recognition

Z Quan, Q Chen, W Wang, M Zhang, X Li… - IEEE Sensors …, 2023 - ieeexplore.ieee.org
Multimodal sensors, including vision sensors and wearable sensors, offer valuable
complementary information for accurate recognition tasks. Nonetheless, the heterogeneity …

GCD-JFSE: Graph-based class-domain knowledge joint feature selection and ensemble learning for EEG-based emotion recognition

G Luo, Y Han, W Xie, F Tian, L Zhu, K Qian, X Li… - Knowledge-Based …, 2025 - Elsevier
Feature selection has demonstrated strong performance in emotion recognition using
intrasubject electroencephalography (EEG) data. However, it faces challenges due to …

Musical instrument classifier for early childhood percussion instruments

B Rufino, A Khan, T Dutta, E Biddiss - Plos one, 2024 - journals.plos.org
While the musical instrument classification task is well-studied, there remains a gap in
identifying non-pitched percussion instruments which have greater overlaps in frequency …

Classification and study of music genres with multimodal Spectro-Lyrical Embeddings for Music (SLEM)

A Mehra, A Mehra, P Narang - Multimedia Tools and Applications, 2024 - Springer
The essence of music is inherently multi-modal–with audio and lyrics going hand in hand.
However, there is very less research done to study the intricacies of the multi-modal nature …

Action knowledge graph for violence detection using audiovisual features

M Khan, M Saad, A Khan, W Gueaieb… - 2024 IEEE …, 2024 - ieeexplore.ieee.org
Detecting violent content in video frames is a crucial aspect of violence detection.
Combining visual and audio cues is often the most effective way to identify violent behavior …

Multi-Head Attention-Enhanced Speech Recognition for Reduced Data Requirements

Y Li, Y Zhou, Z Qiu, Y Wang, J Wang, G Huang - Electronics, 2024 - mdpi.com
Automatic speech recognition (ASR) technology has reached a mature level, and improving
performance in data-scarce scenarios has become a key research focus. In this study, we …

Facial Biosignals Time–Series Dataset (FBioT): A Visual–Temporal Facial Expression Recognition (VT-FER) Approach

JMS Souza, CSM Alves, JJF Cerqueira, WLA Oliveira… - Electronics, 2024 - mdpi.com
Visual biosignals can be used to analyze human behavioral activities and serve as a
primary resource for Facial Expression Recognition (FER). FER computational systems face …

Post-Stroke Dysarthria Voice Recognition based on Fusion Feature MSA and 1D

Y Wujian, Z Yingcong, C Yuehai, L Yijun… - Computer Methods in …, 2024 - Taylor & Francis
Post-stroke Dysarthria (PSD) is one of the common sequelae of stroke. PSD can harm
patients' quality of life and, in severe cases, be life-threatening. Most of the existing methods …