An overview of lead and accompaniment separation in music

Z Rafii, A Liutkus, FR Stöter, SI Mimilakis… - … on Audio, Speech …, 2018 - ieeexplore.ieee.org
Popular music is often composed of an accompaniment and a lead component, the latter
typically consisting of vocals. Filtering such mixtures to extract one or both components has …

Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Music source separation with band-split RNN

Y Luo, J Yu - IEEE/ACM Transactions on Audio, Speech, and …, 2023 - ieeexplore.ieee.org
The performance of music source separation (MSS) models has been greatly improved in
recent years thanks to the development of novel neural network architectures and training …

The sound of pixels

H Zhao, C Gan, A Rouditchenko… - Proceedings of the …, 2018 - openaccess.thecvf.com
We introduce PixelPlayer, a system that, by leveraging large amounts of unlabeled videos,
learns to locate image regions which produce sounds and separate the input sounds into a …

Open-Unmix - a reference implementation for music source separation

FR Stöter, S Uhlich, A Liutkus… - Journal of Open Source …, 2019 - joss.theoj.org
Music source separation is the task of decomposing music into its constitutive components,
e.g., yielding separated stems for the vocals, bass, and drums. Such a separation has many …

Singing voice separation with deep u-net convolutional networks

A Jansson, E Humphrey, N Montecchio, R Bittner… - 2017 - openaccess.city.ac.uk
The decomposition of a music audio signal into its vocal and backing track components is
analogous to image-to-image translation, where a mixed spectrogram is transformed into its …

A wavenet for speech denoising

D Rethage, J Pons, X Serra - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
Most speech processing techniques use magnitude spectrograms as front-end and are
therefore by default discarding part of the signal: the phase. In order to overcome this …

Music gesture for visual sound separation

C Gan, D Huang, H Zhao… - Proceedings of the …, 2020 - openaccess.thecvf.com
Recent deep learning approaches have achieved impressive performance on visual sound
separation tasks. However, these approaches are mostly built on appearance and optical …

The sound of motions

H Zhao, C Gan, WC Ma… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Sounds originate from object motions and vibrations of surrounding air. Inspired by the fact
that humans are capable of interpreting sound sources from how objects move visually, we …

Universal sound separation

I Kavalerov, S Wisdom, H Erdogan… - … IEEE Workshop on …, 2019 - ieeexplore.ieee.org
Recent deep learning approaches have achieved impressive performance on speech
enhancement and separation tasks. However, these approaches have not been investigated …