Explainable and interpretable multimodal large language models: A comprehensive survey

Y Dang, K Huang, J Huo, Y Yan, S Huang, D Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development of Artificial Intelligence (AI) has revolutionized numerous fields, with
large language models (LLMs) and computer vision (CV) systems driving advancements in …

Detecting music deepfakes is easy but actually hard

D Afchar, G Meseguer-Brocal, R Hennequin - arXiv preprint arXiv …, 2024 - arxiv.org
In the face of a new era of generative models, the detection of artificially generated content
has become a matter of utmost importance. The ability to create credible minute-long music …

Tackling interpretability in audio classification networks with non-negative matrix factorization

J Parekh, S Parekh, P Mozharovskyi… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
This article tackles two major problem settings for interpretability of audio processing
networks, post-hoc and by-design interpretation. For post-hoc interpretation, we aim to …

Improved symbolic drum style classification with grammar-based hierarchical representations

L Géré, P Rigaux, N Audebert - arXiv preprint arXiv:2407.17536, 2024 - arxiv.org
Deep learning models have become a critical tool for analysis and classification of musical
data. These models operate either on the audio signal, e.g. waveform or spectrogram, or on a …

SMUG-Explain: A Framework for Symbolic Music Graph Explanations

E Karystinaios, F Foscarin, G Widmer - arXiv preprint arXiv:2405.09241, 2024 - arxiv.org
In this work, we present Score MUsic Graph (SMUG)-Explain, a framework for generating
and visualizing explanations of graph neural networks applied to arbitrary prediction tasks …

MusicLIME: Explainable multimodal music understanding

T Sotirou, V Lyberatos, OM Mastromichalakis… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal models are critical for music understanding tasks, as they capture the complex
interplay between audio and lyrics. However, as these models become more prevalent, the …

Predicting Music Hierarchies with a Graph-Based Neural Decoder

F Foscarin, D Harasim, G Widmer - arXiv preprint arXiv:2306.16955, 2023 - arxiv.org
This paper describes a data-driven framework to parse musical sequences into dependency
trees, which are hierarchical structures used in music cognition research and music …

Interpretable music recommender systems

D Afchar - 2023 - theses.hal.science
"Why do they keep recommending me this music track?" "Why did our system recommend
these tracks to users?" Nowadays, streaming platforms are the most common way to listen to …

PBSCR: The Piano Bootleg Score Composer Recognition Dataset

A Jain, A Bunn, A Pham, TJ Tsai - arXiv preprint arXiv:2401.16803, 2024 - arxiv.org
This article motivates, describes, and presents the PBSCR dataset for studying composer
recognition of classical piano music. Our goal was to design a dataset that facilitates large …

REAL-TIME FUTURE-RHYTHM VISUALIZER FOR DJ PERFORMANCE

M HAMANAKA - … on immersion through laser doppler vibrometry, 2024 - research.ed.ac.uk
In breaking, which features two dancers breakdancing against each other to the rhythm of
music remixed by a disc jockey (DJ), it is important to match the technique to the timing of the …