Environmental Sound Classification: A descriptive review of the literature

A Bansal, NK Garg - Intelligent systems with applications, 2022 - Elsevier
Automatic environmental sound classification (ESC) is one of the upcoming areas of
research as most of the traditional studies are focused on speech and music signals …

Acoustic scene classification: A comprehensive survey

B Ding, T Zhang, C Wang, G Liu, J Liang, R Hu… - Expert Systems with …, 2024 - Elsevier
Acoustic scene classification (ASC) has gained significant interest recently due to its diverse
applications. Various audio signal processing and machine learning methods have been …

VATT: Transformers for multimodal self-supervised learning from raw video, audio and text

H Akbari, L Yuan, R Qian… - Advances in neural …, 2021 - proceedings.neurips.cc
We present a framework for learning multimodal representations from unlabeled data using
convolution-free Transformer architectures. Specifically, our Video-Audio-Text Transformer …

An attention-based deep learning approach for sleep stage classification with single-channel EEG

E Eldele, Z Chen, C Liu, M Wu… - … on Neural Systems …, 2021 - ieeexplore.ieee.org
Automatic sleep stage classification is of great importance to measure sleep
quality. In this paper, we propose a novel attention-based deep learning architecture called …

Earthquake transformer—an attentive deep-learning model for simultaneous earthquake detection and phase picking

SM Mousavi, WL Ellsworth, W Zhu, LY Chuang… - Nature …, 2020 - nature.com
Earthquake signal detection and seismic phase picking are challenging tasks in the
processing of noisy data and the monitoring of microearthquakes. Here we present a global …

PANNs: Large-scale pretrained audio neural networks for audio pattern recognition

Q Kong, Y Cao, T Iqbal, Y Wang… - … on Audio, Speech …, 2020 - ieeexplore.ieee.org
Audio pattern recognition is an important research topic in the machine learning area, and
includes several tasks such as audio tagging, acoustic scene classification, music …

End-to-end environmental sound classification using a 1D convolutional neural network

S Abdoli, P Cardinal, AL Koerich - Expert Systems with Applications, 2019 - Elsevier
In this paper, we present an end-to-end approach for environmental sound classification
based on a 1D Convolution Neural Network (CNN) that learns a representation directly from …

Realistic speech-driven facial animation with GANs

K Vougioukas, S Petridis, M Pantic - International Journal of Computer …, 2020 - Springer
Speech-driven facial animation is the process that automatically synthesizes talking
characters based on speech signals. The majority of work in this domain creates a mapping …

Looking beyond GPUs for DNN scheduling on Multi-Tenant clusters

J Mohan, A Phanishayee, J Kulkarni… - … USENIX Symposium on …, 2022 - usenix.org
Training Deep Neural Networks (DNNs) is a popular workload in both enterprises and cloud
data centers. Existing schedulers for DNN training consider GPU as the dominant resource …

End-to-end speech emotion recognition using deep neural networks

P Tzirakis, J Zhang, BW Schuller - 2018 IEEE international …, 2018 - ieeexplore.ieee.org
Affect recognition is an important component towards the better interaction between human
and machines. Applications of emotion recognition in speech can be found in several areas …