Wavchat: A survey of spoken dialogue models
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o,
have captured significant attention in the speech domain. Compared to traditional three-tier …
Omnibench: Towards the future of universal omni-language models
Recent advancements in multimodal large language models (MLLMs) have aimed to
integrate and interpret data across diverse modalities. However, the capacity of these …
From Audio Deepfake Detection to AI-Generated Music Detection--A Pathway and Overview
As Artificial Intelligence (AI) technologies continue to evolve, their use in generating realistic,
contextually appropriate content has expanded into various domains. Music, an art form and …
LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment
Research in music understanding has extensively explored composition-level attributes
such as key, genre, and instrumentation through advanced representations, leading to cross …
LC-Protonets: Multi-label Few-shot learning for world music audio tagging
We introduce Label-Combination Prototypical Networks (LC-Protonets) to address the
problem of multi-label few-shot classification, where a model must generalize to new classes …
Seeing the Sound: Multilingual Lip Sync for Real-Time Face-to-Face Translation
A Rafiei Oskooei, MS Aktaş, M Keleş - Computers, 2024 - mdpi.com
Imagine a future where language is no longer a barrier to real-time conversations, enabling
instant and lifelike communication across the globe. As cultural boundaries blur, the demand …
Innovation, data colonialism and ethics: critical reflections on the impacts of AI on Irish traditional music
By definition, traditional music is in a constant state of friction with innovation, exemplified by
resistance to 'outside' influences such as different instruments, different ways of learning, and …
Music Genre Classification using Large Language Models
This paper exploits the zero-shot capabilities of pre-trained large language models (LLMs)
for music genre classification. The proposed approach splits audio signals into 20 ms …
Hierarchical Symbolic Pop Music Generation with Graph Neural Networks
Music is inherently made up of complex structures, and representing them as graphs helps
to capture multiple levels of relationships. While music generation has been explored using …
Towards Music Industry 5.0: Perspectives on Artificial Intelligence
A Williams, M Barthet - 2025 - researchgate.net
Artificial Intelligence (AI) is a disruptive technology that is transforming many industries
including the music industry. Recently, the concept of Industry 5.0 has been proposed …