Google Scholar

Masked generative video-to-audio transformers with enhanced synchronicity

S Pascual, C Yeh, I Tsiamas, J Serrà - European Conference on Computer …, 2024 - Springer

Abstract Video-to-audio (V2A) generation leverages visual-only video features to render
plausible sounds that match the scene. Importantly, the generated sound onsets should …

Cited by 8. All 6 versions.

[PDF] arxiv.org

Foleycrafter: Bring silent videos to life with lifelike and synchronized sounds

Y Zhang, Y Gu, Y Zeng, Z Xing, Y Wang, Z Wu… - arXiv preprint arXiv …, 2024 - arxiv.org

We study Neural Foley, the automatic generation of high-quality sound effects synchronizing
with videos, enabling an immersive audio-visual experience. Despite its wide range of …

Cited by 23. All 5 versions.

[PDF] arxiv.org

Temporally aligned audio for video with autoregression

I Viertola, V Iashin, E Rahtu - arXiv preprint arXiv:2409.13689, 2024 - arxiv.org

We introduce V-AURA, the first autoregressive model to achieve high temporal alignment
and relevance in video-to-audio generation. V-AURA uses a high-framerate visual feature …

Cited by 5. All 2 versions.

[PDF] arxiv.org

Foleygen: Visually-guided audio generation

X Mei, V Nagaraja, G Le Lan, Z Ni… - 2024 IEEE 34th …, 2024 - ieeexplore.ieee.org

Recent advancements in audio generation tasks, such as text-to-audio and text-to-music
generation, have been spurred by the evolution of deep learning models and large-scale …

Cited by 11. All 2 versions.

[PDF] arxiv.org

Draw an audio: Leveraging multi-instruction for video-to-audio synthesis

Q Yang, B Mao, Z Wang, X Nie, P Gao, Y Guo… - arXiv preprint arXiv …, 2024 - arxiv.org

Foley is a term commonly used in filmmaking, referring to the addition of daily sound effects
to silent films or videos to enhance the auditory experience. Video-to-Audio (V2A), as a …

Cited by 4. All 3 versions.

[PDF] arxiv.org

Video-guided foley sound generation with multimodal controls

Z Chen, P Seetharaman, B Russell, O Nieto… - arXiv preprint arXiv …, 2024 - arxiv.org

Generating sound effects for videos often requires creating artistic sound effects that diverge
significantly from real-life sources and flexible control in the sound design. To address this …

Cited by 2. All 2 versions.

[HTML] mdpi.com

[HTML] Artificial Taste: Advances and Innovative Applications in Healthcare

L Wang, Y Li, Y Zhang, B Zheng - Applied Sciences, 2025 - mdpi.com

Background: Scientists have recently developed a technology that induces artificial taste
through electronic stimulation. However, scattered reports have made it difficult to …

All 2 versions.

[PDF] arxiv.org

Taming multimodal joint training for high-quality video-to-audio synthesis

HK Cheng, M Ishii, A Hayakawa, T Shibuya… - arXiv preprint arXiv …, 2024 - arxiv.org

We propose to synthesize high-quality and synchronized audio, given video and optional
text conditions, using a novel multimodal joint training framework MMAudio. In contrast to …

Cited by 2. All 2 versions.

[PDF] arxiv.org

Gotta hear them all: Sound source aware vision to audio generation

W Guo, H Wang, W Cai, J Ma - arXiv preprint arXiv:2411.15447, 2024 - arxiv.org

Vision-to-audio (V2A) synthesis has broad applications in multimedia. Recent
advancements of V2A methods have made it possible to generate relevant audios from …

Cited by 2. All 2 versions.

[PDF] arxiv.org

Vintage: Joint video and text conditioning for holistic audio generation

SS Kushwaha, Y Tian - arXiv preprint arXiv:2412.10768, 2024 - arxiv.org

Recent advances in audio generation have focused on text-to-audio (T2A) and video-to-
audio (V2A) tasks. However, T2A or V2A methods cannot generate holistic sounds …

Cited by 1.

V2a-mapper: A lightweight solution for vision-to-audio generation by connecting foundation models

Masked generative video-to-audio transformers with enhanced synchronicity
