Multimodal emotion recognition on RAVDESS dataset using transfer learning
Emotion Recognition is attracting the attention of the research community due to the multiple
areas where it can be applied, such as in healthcare or in road safety systems. In this paper …
MUGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models
The current landscape of research leveraging large language models (LLMs) is
experiencing a surge. Many works harness the powerful reasoning capabilities of these …
Strong labeling of sound events using crowdsourced weak labels and annotator competence estimation
Crowdsourcing is a popular tool for collecting large amounts of annotated data, but the
specific format of the strong labels necessary for sound event detection is not easily …
A comprehensive survey of automated audio captioning
Automated audio captioning, a task that mimics human perception as well as innovatively
links audio processing and natural language processing, has overseen much progress over …
Improving crisis events detection using DistilBERT with Hunger Games Search algorithm
This paper presents an alternative event detection model based on the integration between
the DistilBERT and a new meta-heuristic technique named the Hunger Games Search …
Voice activity detection in the wild: A data-driven approach using teacher-student training
Voice activity detection is an essential pre-processing component for speech-related tasks
such as automatic speech recognition (ASR). Traditional supervised VAD systems obtain …
Blockchain-based event detection and trust verification using natural language processing and machine learning
Information sharing is one of the huge topics in social media platform regarding the daily
news related to events or disasters happens in nature or its human-made. The automatic …
You only hear once: a YOLO-like algorithm for audio segmentation and sound event detection
Audio segmentation and sound event detection are crucial topics in machine listening that
aim to detect acoustic classes and their respective boundaries. It is useful for audio-content …
Training sound event detection with soft labels from crowdsourced annotations
In this paper, we study the use of soft labels to train a system for sound event detection
(SED). Soft labels can result from annotations which account for human uncertainty about …
A mutual learning framework for few-shot sound event detection
Although prototypical network (ProtoNet) has proved to be an effective method for few-shot
sound event detection, two problems still exist. Firstly, the small-scaled support set is …