Google Scholar

Align your latents: High-resolution video synthesis with latent diffusion models

A Blattmann, R Rombach, H Ling… - Proceedings of the …, 2023 - openaccess.thecvf.com

Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding
excessive compute demands by training a diffusion model in a compressed lower …

Cited by 950 - All 7 versions

[PDF] arxiv.org

Stable video diffusion: Scaling latent video diffusion models to large datasets

A Blattmann, T Dockhorn, S Kulal… - arXiv preprint arXiv …, 2023 - arxiv.org

We present Stable Video Diffusion - a latent video diffusion model for high-resolution,
state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained …

Cited by 758 - All 3 versions

[PDF] openreview.net

NExT-GPT: Any-to-any multimodal LLM

S Wu, H Fei, L Qu, W Ji, TS Chua - Forty-first International …, 2024 - openreview.net

While recently Multimodal Large Language Models (MM-LLMs) have made exciting strides,
they mostly fall prey to the limitation of only input-side multimodal understanding, without the …

Cited by 505 - All 6 versions

[PDF] arxiv.org

AI-generated content (AIGC) for various data modalities: A survey

LG Foo, H Rahmani, J Liu - arXiv preprint arXiv:2308.14177, 2023 - arxiv.org

AI-generated content (AIGC) methods aim to produce text, images, videos, 3D assets, and
other media using AI algorithms. Due to its wide range of applications and the demonstrated …

Cited by 28 - All 3 versions

[PDF] arxiv.org

DynamiCrafter: Animating open-domain images with video diffusion priors

J Xing, M Xia, Y Zhang, H Chen, W Yu, H Liu… - … on Computer Vision, 2024 - Springer

Animating a still image offers an engaging visual experience. Traditional image animation
techniques mainly focus on animating natural scenes with stochastic dynamics (e.g. clouds …

Cited by 169 - All 6 versions

[PDF] thecvf.com

Conditional image-to-video generation with latent flow diffusion models

H Ni, C Shi, K Li, SX Huang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Conditional image-to-video (cI2V) generation aims to synthesize a new plausible video
starting from an image (e.g., a person's face) and a condition (e.g., an action class label like …

Cited by 149 - All 7 versions

[PDF] thecvf.com

Driving into the future: Multiview visual forecasting and planning with world model for autonomous driving

Y Wang, J He, L Fan, H Li, Y Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com

In autonomous driving, predicting future events in advance and evaluating the foreseeable
risks empowers autonomous vehicles to plan their actions, enhancing safety and efficiency …

Cited by 84 - All 6 versions

[HTML] mdpi.com

[HTML] Diffusion probabilistic modeling for video generation

R Yang, P Srivastava, S Mandt - Entropy, 2023 - mdpi.com

Denoising diffusion probabilistic models are a promising new class of generative models
that mark a milestone in high-quality image generation. This paper showcases their ability to …

Cited by 244 - All 12 versions

[PDF] thecvf.com

Generative image dynamics

Z Li, R Tucker, N Snavely… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

We present an approach to modeling an image-space prior on scene motion. Our prior is
learned from a collection of motion trajectories extracted from real video sequences …

Cited by 60 - All 8 versions

[PDF] arxiv.org

MaskViT: Masked visual pre-training for video prediction

A Gupta, S Tian, Y Zhang, J Wu, R Martín-Martín… - arXiv preprint arXiv …, 2022 - arxiv.org

The ability to predict future visual observations conditioned on past observations and motor
commands can enable embodied agents to plan solutions to a variety of tasks in complex …

Cited by 127 - All 4 versions

Stochastic image-to-video synthesis using cINNs

Align your latents: High-resolution video synthesis with latent diffusion models‏
