Visual sound localization in the wild by cross-modal interference erasing
The task of audiovisual sound source localization has been well studied under constrained
scenes, where the audio recordings are clean. However, in real world scenarios, audios are …
Hybrid Transformer Architectures with Diverse Audio Features for Deepfake Speech Classification
The rise of synthetic speech technologies has triggered growing concerns about the
increasing difficulty in distinguishing between real and fake voices. In this context, we …
Pattern analysis based acoustic signal processing: a survey of the state-of-art
J Chaki - International Journal of Speech Technology, 2021 - Springer
Audio signal processing is the most challenging field in the current era for an analysis of an
audio signal. Audio signal classification (ASC) comprises of generating appropriate features …
Bag-of-features methods for acoustic event detection and classification
The detection and classification of acoustic events in various environments is an important
task. Its applications range from multimedia analysis to surveillance of humans or even …
Adaptive multi-scale detection of acoustic events
The goal of acoustic (or sound) events detection (AED or SED) is to predict the temporal
position of target events in given audio segments. This task plays a significant role in safety …
Musicyolo: A vision-based framework for automatic singing transcription
X Wang, B Tian, W Yang, W Xu… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org
Automatic singing transcription (AST), which refers to the process of inferring the onset,
offset, and pitch from the singing audio, is of great significance in music information retrieval …
Detection and classification of human-produced nonverbal audio events
Audio wearable devices, or hearables, are becoming an increasingly popular consumer
product. Some of these hearables contain an in-ear microphone to capture audio signals …
Unifying isolated and overlapping audio event detection with multi-label multi-task convolutional recurrent neural networks
We propose a multi-label multi-task framework based on a convolutional recurrent neural
network to unify detection of isolated and overlapping audio events. The framework …
Recognition of breathing activity and medication adherence using LSTM neural networks
Obstructive inflammatory pulmonary diseases are life-long conditions of the airways
affecting millions worldwide. A crucial step towards effective self-management is the …
Deep learning based sound event detection and classification
A Nasiri - 2021 - search.proquest.com
Hearing sense has an important role in our daily lives. During the recent years, there has
been many studies to transfer this capability to the computers. In this dissertation, we design …