Multimodal emotion recognition on RAVDESS dataset using transfer learning

C Luna-Jiménez, D Griol, Z Callejas, R Kleinlein… - Sensors, 2021 - mdpi.com
Emotion Recognition is attracting the attention of the research community due to the multiple
areas where it can be applied, such as in healthcare or in road safety systems. In this paper …

MUGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models

S Liu, AS Hussain, C Sun, Y Shan - arXiv preprint arXiv:2311.11255, 2023 - arxiv.org
The current landscape of research leveraging large language models (LLMs) is
experiencing a surge. Many works harness the powerful reasoning capabilities of these …

Strong labeling of sound events using crowdsourced weak labels and annotator competence estimation

I Martín-Morató, A Mesaros - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
Crowdsourcing is a popular tool for collecting large amounts of annotated data, but the
specific format of the strong labels necessary for sound event detection is not easily …

A comprehensive survey of automated audio captioning

X Xu, M Wu, K Yu - arXiv preprint arXiv:2205.05357, 2022 - arxiv.org
Automated audio captioning, a task that mimics human perception as well as innovatively
links audio processing and natural language processing, has overseen much progress over …

Improving crisis events detection using DistilBERT with Hunger Games Search algorithm

H Adel, A Dahou, A Mabrouk, M Abd Elaziz, M Kayed… - Mathematics, 2022 - mdpi.com
This paper presents an alternative event detection model based on the integration between
the DistilBERT and a new meta-heuristic technique named the Hunger Games Search …

Voice activity detection in the wild: A data-driven approach using teacher-student training

H Dinkel, S Wang, X Xu, M Wu… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Voice activity detection is an essential pre-processing component for speech-related tasks
such as automatic speech recognition (ASR). Traditional supervised VAD systems obtain …

Blockchain-based event detection and trust verification using natural language processing and machine learning

Z Shahbazi, YC Byun - IEEE Access, 2021 - ieeexplore.ieee.org
Information sharing is one of the huge topics in social media platform regarding the daily
news related to events or disasters happens in nature or its human-made. The automatic …

You only hear once: a YOLO-like algorithm for audio segmentation and sound event detection

S Venkatesh, D Moffat, ER Miranda - Applied Sciences, 2022 - mdpi.com
Audio segmentation and sound event detection are crucial topics in machine listening that
aim to detect acoustic classes and their respective boundaries. It is useful for audio-content …

Training sound event detection with soft labels from crowdsourced annotations

I Martín-Morató, M Harju, P Ahokas… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
In this paper, we study the use of soft labels to train a system for sound event detection
(SED). Soft labels can result from annotations which account for human uncertainty about …

A mutual learning framework for few-shot sound event detection

D Yang, H Wang, Y Zou, Z Ye… - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org
Although prototypical network (ProtoNet) has proved to be an effective method for few-shot
sound event detection, two problems still exist. Firstly, the small-scaled support set is …