Survey of deep learning paradigms for speech processing
KB Bhangale, M Kothandaraman - Wireless Personal Communications, 2022 - Springer
Over the past decades, a particular focus is given to research on machine learning
techniques for speech processing applications. However, in the past few years, research …
Music emotion recognition using convolutional long short term memory deep neural networks
In this paper, we propose an approach for music emotion recognition based on
convolutional long short term memory deep neural network (CLDNN) architecture. In …
Multiclass audio segmentation based on recurrent neural networks for broadcast domain data
This paper presents a new approach based on recurrent neural networks (RNN) to the
multiclass audio segmentation task whose goal is to classify an audio signal as speech …
A multi-resolution CRNN-based approach for semi-supervised sound event detection in DCASE 2020 challenge
Sound Event Detection is a task with a rising relevance over the recent years in the field of
audio signal processing, due to the creation of specific datasets such as Google AudioSet or …
Ethio-Semitic language identification using convolutional neural networks with data augmentation
In today's digital world, natural language is used to exchange information among humans,
and it has now advanced to the point of being an evolution criteria for technology. The …
Towards cross-modal pre-training and learning tempo-spatial characteristics for audio recognition with convolutional and recurrent neural networks
In this paper, we investigate the performance of two deep learning paradigms for the audio-
based tasks of acoustic scene, environmental sound and domestic activity classification. In …
End-to-end sequence labeling via convolutional recurrent neural network with a connectionist temporal classification layer
X Huang, L Qiao, W Yu, J Li, Y Ma - International Journal of Computational …, 2020 - Springer
Sequence labeling is a common machine-learning task which not only needs the most likely
prediction of label for a local input but also seeks the most suitable annotation for the whole …
A large TV dataset for speech and music activity detection
Automatic speech and music activity detection (SMAD) is an enabling task that can help
segment, index, and pre-process audio content in radio broadcast and TV programs …
Semi-supervised machine condition monitoring by learning deep discriminative audio features
In this study, we aim to learn highly descriptive representations for a wide set of machinery
sounds and exploit this knowledge to perform condition monitoring of mechanical …
Music emotion recognition based on a modified brain emotional learning model
Listening to music can evoke different emotions in humans. Music emotion recognition
(MER) can predict a person's emotions before listening to a song. However, there are three …