Advancements in preprocessing, detection and classification techniques for ecoacoustic data: A comprehensive review for large-scale Passive Acoustic …

T Napier, E Ahn, S Allen-Ankins, L Schwarzkopf… - Expert Systems with …, 2024 - Elsevier
Computational ecoacoustics has seen significant growth in recent decades, facilitated by the
reduced costs of digital sound recording devices and data storage. This progress has …

Music controlnet: Multiple time-varying controls for music generation

SL Wu, C Donahue, S Watanabe… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Text-to-music generation models are now capable of generating high-quality music audio in
broad styles. However, text control is primarily suitable for the manipulation of global musical …

Pyannote.audio: neural building blocks for speaker diarization

H Bredin, R Yin, JM Coria, G Gelly… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
We introduce pyannote.audio, an open-source toolkit written in Python for speaker
diarization. Based on PyTorch machine learning framework, it provides a set of trainable end …

Audio2gestures: Generating diverse gestures from speech audio with conditional variational autoencoders

J Li, D Kang, W Pei, X Zhe, Y Zhang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Generating conversational gestures from speech audio is challenging due to the inherent
one-to-many mapping between audio and body motions. Conventional CNNs/RNNs …

Hybrid LSTM-transformer model for emotion recognition from speech audio files

F Andayani, LB Theng, MT Tsun, C Chua - IEEE Access, 2022 - ieeexplore.ieee.org
Emotion is a vital component in daily human communication and it helps people understand
each other. Emotion recognition plays a crucial role in developing human-computer …

An emotion-based personalized music recommendation framework for emotion improvement

Z Liu, W Xu, W Zhang, Q Jiang - Information Processing & Management, 2023 - Elsevier
Music has a close relationship with people's emotion and mental status. Music
recommendation has both economic and social benefits. Unfortunately, most existing music …

A multimodal hierarchical approach to speech emotion recognition from audio and text

P Singh, R Srivastava, KPS Rana, V Kumar - Knowledge-Based Systems, 2021 - Elsevier
Speech emotion recognition (SER) plays a crucial role in improving the quality of man–
machine interfaces in various fields like distance learning, medical science, virtual …

Automated dysarthria severity classification: A study on acoustic features and deep learning techniques

AA Joshy, R Rajan - IEEE Transactions on Neural Systems and …, 2022 - ieeexplore.ieee.org
Assessing the severity level of dysarthria can provide an insight into the patient's
improvement, assist pathologists to plan therapy, and aid automatic dysarthric speech …

Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection

D Bhattacharya, NK Sharma, D Dutta, SR Chetupalli… - Scientific data, 2023 - nature.com
This paper presents the Coswara dataset, a dataset containing diverse set of respiratory
sounds and rich meta-data, recorded between April-2020 and February-2022 from 2635 …

Robustness of musical features on deep learning models for music genre classification

Y Singh, A Biswas - Expert Systems with Applications, 2022 - Elsevier
Music information retrieval (MIR) has witnessed rapid advances in various tasks like musical
similarity, music genre classification (MGC), etc. MGC and audio tagging are approached …