RQNet: Residual quaternion CNN for performance enhancement in low complexity and device robust acoustic scene classification

A Madhu, K Suresh - IEEE Transactions on Multimedia, 2023 - ieeexplore.ieee.org
Acoustic Scene Classification aims to recognize the unique acoustic characteristics of an
environment. Recently, Convolutional Neural Networks (CNNs) have boosted the accuracy …

HYU Submission for the DCASE 2022: Fine-tuning method using device-aware data-random-drop for device-imbalanced acoustic scene classification

JH Lee, JH Choi, PM Byun… - Detection Classif. Acoust …, 2022 - dcase.community
This paper addresses the Hanyang University team submission for the DCASE 2022
Challenge Low-Complexity Acoustic Scene Classification task. The task aims to design a …

Dummy prototypical networks for few-shot open-set keyword spotting

B Kim, S Yang, I Chung, S Chang - ar…
… a lightweight speaker embedding extractor (SEE) is crucial for the practical
implementation of automatic speaker verification (ASV) systems. To this end, we recently …

RandMasking Augment: A simple and randomized data augmentation for acoustic scene classification

J Han, M Matuszewski, O Sikorski… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
In this work, we describe RandMasking Augment as an effective data augmentation method
for acoustic scene classification research. We concentrate on both time and frequency …

Multi-Scale Architecture and Device-Aware Data-Random-Drop Based Fine-Tuning Method for Acoustic Scene Classification

JH Lee, JH Choi, PM Byun, JH Chang - DCASE, 2022 - dcase.community
We propose a low-complexity acoustic scene classification (ASC) model structure suitable
for short-segmented audio and fine-tuning methods for generalization to multiple recording …

Domain agnostic few-shot learning for speaker verification

S Yang, D Das, J Cho, H Park, S Yun - arXiv preprint arXiv:2206.13700, 2022 - arxiv.org
Deep learning models for verification systems often fail to generalize to new users and new
environments, even though they learn highly discriminative features. To address this …

Synthetic data generation techniques for training deep acoustic siren identification networks

S Damiano, B Cramer, A Guntoro… - Frontiers in Signal …, 2024 - frontiersin.org
Acoustic sensing has been widely exploited for the early detection of harmful situations in
urban environments: in particular, several siren identification algorithms based on deep …

What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions

H Meng, V Sethu, E Ambikairajah - arXiv preprint arXiv:2404.06702, 2024 - arxiv.org
There is increasing interest in the use of the LEArnable Front-end (LEAF) in a variety of
speech processing systems. However, there is a dearth of analyses of what is actually learnt …

Progressive unsupervised domain adaptation for ASR using ensemble models and multi-stage training

R Ahmad, MU Farooq, T Hain - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
In Automatic Speech Recognition (ASR), teacher-student (T/S) training has been shown to
perform well for domain adaptation with a small amount of training data. However, adaption …