Multimodal pretraining, adaptation, and generation for recommendation: A survey

Q Liu, J Zhu, Y Yang, Q Dai, Z Du, XM Wu… - Proceedings of the 30th …, 2024 - dl.acm.org
Personalized recommendation serves as a ubiquitous channel for users to discover
information tailored to their interests. However, traditional recommendation models primarily …

Marble: Music audio representation benchmark for universal evaluation

R Yuan, Y Ma, Y Li, G Zhang, X Chen… - Advances in …, 2023 - proceedings.neurips.cc
In the era of extensive intersection between art and Artificial Intelligence (AI), such as image
generation and fiction co-creation, AI for music remains relatively nascent, particularly in …

End-to-end modeling via information tree for one-shot natural language spatial video grounding

M Li, T Wang, H Zhang, S Zhang, Z Zhao… - arXiv preprint arXiv …, 2022 - arxiv.org
Natural language spatial video grounding aims to detect the relevant objects in video frames
with descriptive sentences as the query. In spite of the great advances, most existing …

On the effectiveness of speech self-supervised learning for music

Y Ma, R Yuan, Y Li, G Zhang, X Chen, H Yin… - arXiv preprint arXiv …, 2023 - arxiv.org
Self-supervised learning (SSL) has shown promising results in various speech and natural
language processing applications. However, its efficacy in music information retrieval (MIR) …

Contrastive balancing representation learning for heterogeneous dose-response curves estimation

M Zhu, A Wu, H Li, R Xiong, B Li, X Yang… - Proceedings of the …, 2024 - ojs.aaai.org
Estimating the individuals' potential response to varying treatment doses is crucial for
decision-making in areas such as precision medicine and management science. Most …

Discover: Disentangled music representation learning for cover song identification

J Xun, S Zhang, Y Yang, J Zhu, L Deng… - Proceedings of the 46th …, 2023 - dl.acm.org
In the field of music information retrieval (MIR), cover song identification (CSI) is a
challenging task that aims to identify cover versions of a query song from a massive …

On the effect of data-augmentation on local embedding properties in the contrastive learning of music audio representations

MC McCallum, MEP Davies, F Henkel… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Audio embeddings are crucial tools in understanding large catalogs of music. Typically
embeddings are evaluated on the basis of the performance they provide in a wide range of …

Pre-training strategies using contrastive learning and playlist information for music classification and similarity

P Alonso-Jiménez, X Favory… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
In this work, we investigate an approach that relies on contrastive learning and music
metadata as a weak source of supervision to train music representation models. Recent …

Multimodal Pretraining and Generation for Recommendation: A Tutorial

J Zhu, X Zhou, C Wu, R Zhang, Z Dong - … of the ACM on Web Conference …, 2024 - dl.acm.org
Personalized recommendation stands as a ubiquitous channel for users to explore
information or items aligned with their interests. Nevertheless, prevailing recommendation …

Equivariant self-supervision for musical tempo estimation

E Quinton - arXiv preprint arXiv:2209.01478, 2022 - arxiv.org
Self-supervised methods have emerged as a promising avenue for representation learning
in the recent years since they alleviate the need for labeled datasets, which are scarce and …