Virtual instrument performances (VIP): A comprehensive review

T Kyriakou, MÁ de la Campa Crespo… - Computer Graphics …, 2024 - Wiley Online Library
Driven by recent advancements in Extended Reality (XR), the hype around the Metaverse,
and real‐time computer graphics, the transformation of the performing arts, particularly in …

Language-guided audio-visual source separation via trimodal consistency

R Tan, A Ray, A Burns, BA Plummer… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a self-supervised approach for learning to perform audio source separation in
videos based on natural language queries, using only unlabeled video and audio pairs as …

Conditioned source separation for musical instrument performances

O Slizovskaia, G Haro, E Gómez - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org
In music source separation, the number of sources may vary for each piece and some of the
sources may belong to the same family of instruments, thus sharing timbral characteristics …

TA2V: Text-audio guided video generation

M Zhao, W Wang, T Chen, R Zhang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recent conditional and unconditional video generation tasks have been accomplished
mainly based on generative adversarial network (GAN), diffusion, and autoregressive …

Multi-instrumentalist net: Unsupervised generation of music from body movements

K Su, X Liu, E Shlizerman - arXiv preprint arXiv:2012.03478, 2020 - arxiv.org
We propose a novel system that takes as input the body movements of a musician playing a
musical instrument and generates music in an unsupervised setting. Learning to generate …

CCOM-HuQin: An annotated multimodal Chinese fiddle performance dataset

Y Zhang, Z Zhou, X Li, F Yu, M Sun - arXiv preprint arXiv:2209.06496, 2022 - arxiv.org
HuQin is a family of traditional Chinese bowed string instruments. Playing techniques (PTs)
embodied in various playing styles add abundant emotional coloring and aesthetic feelings …

MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing

YF Huang, N Moran, S Coleman, J Kelly… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
In cross-modal music processing, translation between visual, auditory, and semantic content
opens up new possibilities as well as challenges. The construction of such a transformative …

Points2Sound: From mono to binaural audio using 3D point cloud scenes

F Lluís, V Chatziioannou, A Hofmann - EURASIP Journal on Audio …, 2022 - Springer
For immersive applications, the generation of binaural sound that matches its visual
counterpart is crucial to bring meaningful experiences to people in a virtual environment …

Neural music instrument cloning from few samples

N Jonason, B Sturm - 25th International Conference on Digital Audio …, 2022 - diva-portal.org
Neural music instrument cloning is an application of deep neural networks for imitating the
timbre of a particular music instrument recording with a trained neural network. One can …

Annotation-Free MIDI-to-Audio Synthesis via Concatenative Synthesis and Generative Refinement

O Take, T Akama - arXiv preprint arXiv:2410.16785, 2024 - arxiv.org
Recent MIDI-to-audio synthesis methods have employed deep neural networks to
successfully generate high-quality and expressive instrumental tracks. However, these …