A survey of audio classification using deep learning

K Zaman, M Sah, C Direkoglu, M Unoki - IEEE Access, 2023 - ieeexplore.ieee.org
Deep learning can be used for audio signal classification in a variety of ways. It can be used
to detect and classify various types of audio signals such as speech, music, and …

A hybrid feature-extracted deep CNN with reduced parameters substitutes an End-to-End CNN for the recognition of spoken Bengali digits

B Paul, S Phadikar - Multimedia Tools and Applications, 2024 - Springer
Speech Recognition (SR) is an emerging field in the native language nowadays.
Recognizing isolated words in the local language helps people use smartphones and …

A variational Bayesian approach to learning latent variables for acoustic knowledge transfer

H Hu, SM Siniscalchi, CHH Yang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
We propose a variational Bayesian (VB) approach to learning distributions of latent
variables in deep neural network (DNN) models for cross-domain knowledge transfer, to …

SpeechEQ: Speech emotion recognition based on multi-scale unified datasets and multitask learning

Z Kang, J Peng, J Wang, J Xiao - arXiv preprint arXiv:2206.13101, 2022 - arxiv.org
Speech emotion recognition (SER) has many challenges, but one of the main challenges is
that each framework does not have a unified standard. In this paper, we propose SpeechEQ …

Boosting StarGANs for voice conversion with contrastive discriminator

S Si, J Wang, X Zhang, X Qu, N Cheng… - … Conference on Neural …, 2022 - Springer
Nonparallel multi-domain voice conversion methods such as the StarGAN-VCs have been
widely applied in many scenarios. However, the training of these models usually poses a …

Information Bottleneck-Based Domain Adaptation for Hybrid Deep Learning in Scalable Network Slicing

T Hu, Q Liao, Q Liu, G Carle - IEEE Transactions on Machine …, 2024 - ieeexplore.ieee.org
Network slicing enables operators to efficiently support diverse applications on a shared
infrastructure. However, the evolving complexity of networks, compounded by inter-cell …

Instance-level loss based multiple-instance learning for acoustic scene classification

WG Choi, JH Chang, JM Yang, HG Moon - 2022 - osf.io
In acoustic scene classification (ASC) task, an acoustic scene consists of diverse attributes
and is inferred by identifying combinations of some distinct attributes among them. This …

Variational Bayesian Adaptive Learning of Deep Latent Variables for Acoustic Knowledge Transfer

H Hu, SM Siniscalchi, CHH Yang… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org
In this work, we propose a novel variational Bayesian adaptive learning approach for cross-
domain knowledge transfer to address acoustic mismatches between training and testing …

Augmentation-induced consistency regularization for classification

J Wu, S Si, J Wang, J Xiao - 2022 International Joint …, 2022 - ieeexplore.ieee.org
Deep neural networks have become popular in many supervised learning tasks, but they
may suffer from overfitting when the training dataset is limited. To mitigate this, many …

Uncertainty Calibration for Deep Audio Classifiers

T Ye, S Si, J Wang, N Cheng, J Xiao - arXiv preprint arXiv:2206.13071, 2022 - arxiv.org
Although deep neural networks (DNNs) have achieved tremendous success in audio
classification tasks, their uncertainty calibration is still under-explored. A well-calibrated …