Visual sound localization in the wild by cross-modal interference erasing

X Liu, R Qian, H Zhou, D Hu, W Lin, Z Liu… - Proceedings of the …, 2022 - ojs.aaai.org
The task of audiovisual sound source localization has been well studied under constrained
scenes, where the audio recordings are clean. However, in real world scenarios, audios are …

Hybrid Transformer Architectures with Diverse Audio Features for Deepfake Speech Classification

K Zaman, IJAM Samiul, M Sah, C Direkoglu… - IEEE …, 2024 - ieeexplore.ieee.org
The rise of synthetic speech technologies has triggered growing concerns about the
increasing difficulty in distinguishing between real and fake voices. In this context, we …

Pattern analysis based acoustic signal processing: a survey of the state-of-art

J Chaki - International Journal of Speech Technology, 2021 - Springer
Audio signal processing is the most challenging field in the current era for an analysis of an
audio signal. Audio signal classification (ASC) comprises of generating appropriate features …

Bag-of-features methods for acoustic event detection and classification

R Grzeszick, A Plinge, GA Fink - IEEE/ACM Transactions on …, 2017 - ieeexplore.ieee.org
The detection and classification of acoustic events in various environments is an important
task. Its applications range from multimedia analysis to surveillance of humans or even …

Adaptive multi-scale detection of acoustic events

W Ding, L He - IEEE/ACM Transactions on Audio, Speech, and …, 2019 - ieeexplore.ieee.org
The goal of acoustic (or sound) events detection (AED or SED) is to predict the temporal
position of target events in given audio segments. This task plays a significant role in safety …

MusicYOLO: A vision-based framework for automatic singing transcription

X Wang, B Tian, W Yang, W Xu… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Automatic singing transcription (AST), which refers to the process of inferring the onset,
offset, and pitch from the singing audio, is of great significance in music information retrieval …

Detection and classification of human-produced nonverbal audio events

P Chabot, RE Bouserhal, P Cardinal, J Voix - Applied Acoustics, 2021 - Elsevier
Audio wearable devices, or hearables, are becoming an increasingly popular consumer
product. Some of these hearables contain an in-ear microphone to capture audio signals …

Unifying isolated and overlapping audio event detection with multi-label multi-task convolutional recurrent neural networks

H Phan, OY Chén, P Koch, L Pham… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
We propose a multi-label multi-task framework based on a convolutional recurrent neural
network to unify detection of isolated and overlapping audio events. The framework …

Recognition of breathing activity and medication adherence using LSTM neural networks

D Pettas, S Nousias, EI Zacharaki… - 2019 IEEE 19th …, 2019 - ieeexplore.ieee.org
Obstructive inflammatory pulmonary diseases are life-long conditions of the airways
affecting millions worldwide. A crucial step towards effective self-management is the …

Deep learning based sound event detection and classification

A Nasiri - 2021 - search.proquest.com
Hearing sense has an important role in our daily lives. During the recent years, there has
been many studies to transfer this capability to the computers. In this dissertation, we design …