Self-supervised learning for videos: A survey

MC Schiappa, YS Rawat, M Shah - ACM Computing Surveys, 2023 - dl.acm.org

The remarkable success of deep learning in various domains relies on the availability of
large-scale annotated datasets. However, obtaining annotations is expensive and requires …

Cited by 152

Sound event detection: A tutorial

A Mesaros, T Heittola, T Virtanen… - IEEE Signal …, 2021 - ieeexplore.ieee.org

Imagine standing on a street corner in the city. With your eyes closed you can hear and
recognize a succession of sounds: cars passing by, people speaking, their footsteps when …

Cited by 271

Wav2CLIP: Learning robust audio representations from CLIP

HH Wu, P Seetharaman, K Kumar… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

We propose Wav2CLIP, a robust audio representation learning method by distilling from
Contrastive Language-Image Pre-training (CLIP). We systematically evaluate Wav2CLIP on …

Cited by 281

Listen, think, and understand

Y Gong, H Luo, AH Liu, L Karlinsky, J Glass - arXiv preprint arXiv …, 2023 - arxiv.org

The ability of artificial intelligence (AI) systems to perceive and comprehend audio signals is
crucial for many applications. Although significant progress has been made in this area …

Cited by 145

BYOL for audio: Self-supervised learning for general-purpose audio representation

D Niizumi, D Takeuchi, Y Ohishi… - … Joint Conference on …, 2021 - ieeexplore.ieee.org

Inspired by the recent progress in self-supervised learning for computer vision that
generates supervision using data augmentations, we explore a new general-purpose audio …

Cited by 196

Voice2series: Reprogramming acoustic models for time series classification

CHH Yang, YY Tsai, PY Chen - International conference on …, 2021 - proceedings.mlr.press

Learning to classify time series with limited data is a practical yet challenging problem.
Current methods are primarily based on hand-designed feature extraction rules or domain …

Cited by 153

Masked spectrogram modeling using masked autoencoders for learning general-purpose audio representation

D Niizumi, D Takeuchi, Y Ohishi… - … Evaluation of Audio …, 2022 - proceedings.mlr.press

Recent general-purpose audio representations show state-of-the-art performance on
various audio tasks. These representations are pre-trained by self-supervised learning …

Cited by 62

Contrastive learning of musical representations

J Spijkervet, JA Burgoyne - arXiv preprint arXiv:2103.09410, 2021 - arxiv.org

While deep learning has enabled great advances in many areas of music, labeled music
datasets remain especially hard, expensive, and time-consuming to create. In this work, we …

Cited by 143

Detection of COVID-19 from voice, cough and breathing patterns: Dataset and preliminary results

V Despotovic, M Ismael, M Cornil, R Mc Call… - Computers in Biology …, 2021 - Elsevier

COVID-19 heavily affects breathing and voice and causes symptoms that make patients'
voices distinctive, creating recognizable audio signatures. Initial studies have already …

Cited by 84

Stable audio open

Z Evans, JD Parker, CJ Carr, Z Zukowski… - arXiv preprint arXiv …, 2024 - arxiv.org

Open generative models are vitally important for the community, allowing for fine-tunes and
serving as baselines when presenting new models. However, most current text-to-audio …

Cited by 30

Look, listen, and learn more: Design choices for deep audio embeddings

Self-supervised learning for videos: A survey
