- Academic Search

How to scale your EMA

D Busbridge, J Ramapuram, P Ablin… - Advances in …, 2024 - proceedings.neurips.cc

Preserving training dynamics across batch sizes is an important tool for practical machine
learning as it enables the trade-off between batch size and wall-clock time. This trade-off is …

Cited by 13

Dexterity from touch: Self-supervised pre-training of tactile representations with robotic play

I Guzey, B Evans, S Chintala, L Pinto - arXiv preprint arXiv:2303.12076, 2023 - arxiv.org

Teaching dexterity to multi-fingered robots has been a longstanding challenge in robotics.
Most prominent work in this area focuses on learning controllers or policies that either …

Cited by 52

Masked modeling duo: Learning representations by encouraging both networks to model the input

D Niizumi, D Takeuchi, Y Ohishi… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Masked Autoencoders is a simple yet powerful self-supervised learning method. However, it
learns representations indirectly by reconstructing masked input patches. Several methods …

Cited by 32

Self-supervised audio teacher-student transformer for both clip-level and frame-level tasks

X Li, N Shao, X Li - IEEE/ACM Transactions on Audio, Speech …, 2024 - ieeexplore.ieee.org

Self-supervised learning (SSL) has emerged as a popular approach for learning audio
representations. One goal of audio self-supervised pre-training is to transfer knowledge to …

Cited by 25

Self-supervised learning for speech enhancement through synthesis

B Irvin, M Stamenovic, M Kegler… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Modern speech enhancement (SE) networks typically implement noise suppression through
time-frequency masking, latent representation masking, or discriminative signal prediction …

Cited by 20

XKD: Cross-modal knowledge distillation with domain alignment for video representation learning

P Sarkar, A Etemad - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

We present XKD, a novel self-supervised framework to learn meaningful representations
from unlabelled videos. XKD is trained with two pseudo objectives. First, masked data …

Cited by 20

Self-supervised learning for anomalous sound detection

K Wilkinghoff - … 2024-2024 IEEE International Conference on …, 2024 - ieeexplore.ieee.org

State-of-the-art anomalous sound detection (ASD) systems are often trained by using an
auxiliary classification task to learn an embedding space. Doing so enables the system to …

Cited by 17

On the effect of data-augmentation on local embedding properties in the contrastive learning of music audio representations

MC McCallum, MEP Davies, F Henkel… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Audio embeddings are crucial tools in understanding large catalogs of music. Typically
embeddings are evaluated on the basis of the performance they provide in a wide range of …

Cited by 8

Self-supervised learning for few-shot bird sound classification

I Moummad, N Farrugia… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

Self-supervised learning (SSL) in audio holds significant potential across various domains,
particularly in situations where abundant, unlabeled data is readily available at no cost. This …

Cited by 7

Benchmarking Representations for Speech, Music, and Acoustic Events

M La Quatra, A Koudounas, L Vaiani, E Baralis… - arXiv preprint arXiv …, 2024 - arxiv.org

Limited diversity in standardized benchmarks for evaluating audio representation learning
(ARL) methods may hinder systematic comparison of current methods' capabilities. We …

Cited by 10


BYOL for audio: Exploring pre-trained general-purpose audio representations

How to scale your ema
