Self-supervised visual acoustic matching
Acoustic matching aims to re-synthesize an audio clip to sound as if it were recorded in a
target acoustic environment. Existing methods assume access to paired training data, where …
Visual acoustic matching
We introduce the visual acoustic matching task, in which an audio clip is transformed to
sound like it was recorded in a target environment. Given an image of the target environment …
Novel-view acoustic synthesis
We introduce the novel-view acoustic synthesis (NVAS) task: given the sight and sound
observed at a source viewpoint, can we synthesize the sound of that scene from an unseen …
Rendering spatial sound for interoperable experiences in the audio metaverse
JM Jot, R Audfray, M Hertensteiner… - 2021 Immersive and …, 2021 - ieeexplore.ieee.org
Interactive audio spatialization technology previously developed for video game authoring
and rendering has evolved into an essential component of platforms enabling shared …
Mixed reality spatial audio
BL Schmidt, J Tajik, JM Jot - US Patent 10,616,705, 2020 - Google Patents
(57) ABSTRACT A method of presenting an audio signal to a user of a mixed reality
environment is disclosed. According to examples of the method, an audio event associated …
ViT-TTS: visual text-to-speech with scalable diffusion transformer
Text-to-speech (TTS) has undergone remarkable improvements in performance, particularly
with the advent of Denoising Diffusion Probabilistic Models (DDPMs). However, the …
Blind room volume estimation from single-channel noisy speech
Recent work on acoustic parameter estimation indicates that geometric room volume can be
useful for modeling the character of an acoustic environment. However, estimating volume …
Blind acoustic room parameter estimation using phase features
Modeling room acoustics in real-world settings involves some degree of blind parameter
estimation from noisy and reverberant audio. Modern approaches leverage convolutional …
Audio splicing detection using convolutional neural network
Audio forensics includes audio authentication, in which a major investigation
topic is audio tampering detection. In this paper, we present a novel method of splicing …
Mutual learning for acoustic matching and dereverberation via visual scene-driven diffusion
Visual acoustic matching (VAM) is pivotal for enhancing the immersive experience, and the
task of dereverberation is effective in improving audio intelligibility. Existing methods treat …