Google Scholar

An overview of diffusion models: Applications, guided generation, statistical rates and optimization

M Chen, S Mei, J Fan, M Wang - arXiv preprint arXiv:2404.07771, 2024 - arxiv.org

Diffusion models, a powerful and universal generative AI technology, have achieved
tremendous success in computer vision, audio, reinforcement learning, and computational …

Cited by 53 - All 2 versions

[PDF] arxiv.org

The Llama 3 herd of models

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv …, 2024 - arxiv.org

Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

Cited by 2794 - All 4 versions

[PDF] arxiv.org

LaVie: High-quality video generation with cascaded latent diffusion models

Y Wang, X Chen, X Ma, S Zhou, Z Huang… - International Journal of …, 2024 - Springer

This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …

Cited by 228 - All 4 versions

[PDF] oup.com

Opportunities and challenges of diffusion models for generative AI

M Chen, S Mei, J Fan, M Wang - National Science Review, 2024 - academic.oup.com

Diffusion models, a powerful and universal generative artificial intelligence technology, have
achieved tremendous success and opened up new possibilities in diverse applications. In …

Cited by 9 - All 7 versions

[PDF] arxiv.org

Fast high-resolution image synthesis with latent adversarial diffusion distillation

A Sauer, F Boesel, T Dockhorn, A Blattmann… - SIGGRAPH Asia 2024 …, 2024 - dl.acm.org

Diffusion models are the main driver of progress in image and video synthesis, but suffer
from slow inference speed. Distillation methods, like the recently introduced adversarial …

Cited by 78 - All 3 versions

[PDF] neurips.cc

MiraData: A large-scale video dataset with long durations and structured captions

X Ju, Y Gao, Z Zhang, Z Yuan… - Advances in …, 2025 - proceedings.neurips.cc

Sora's high-motion intensity and long consistent videos have significantly impacted the field
of video generation, attracting unprecedented attention. However, existing publicly available …

Cited by 30 - All 4 versions

[PDF] arxiv.org

Show-o: One single transformer to unify multimodal understanding and generation

J Xie, W Mao, Z Bai, DJ Zhang, W Wang, KQ Lin… - arXiv preprint arXiv …, 2024 - arxiv.org

We present a unified transformer, i.e., Show-o, that unifies multimodal understanding and
generation. Unlike fully autoregressive models, Show-o unifies autoregressive and …

Cited by 86 - All 4 versions

[PDF] neurips.cc

Discrete flow matching

I Gat, T Remez, N Shaul, F Kreuk… - Advances in …, 2025 - proceedings.neurips.cc

Despite Flow Matching and diffusion models having emerged as powerful
generative paradigms for continuous variables such as images and videos, their application …

Cited by 30 - All 5 versions

[PDF] arxiv.org

Emu3: Next-token prediction is all you need

X Wang, X Zhang, Z Luo, Q Sun, Y Cui, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

While next-token prediction is considered a promising path towards artificial general
intelligence, it has struggled to excel in multimodal tasks, which are still dominated by …

Cited by 76 - All 3 versions

[PDF] neurips.cc

On statistical rates and provably efficient criteria of latent diffusion transformers (DiTs)

JYC Hu, W Wu, Z Li, S Pi, Z Song… - Advances in Neural …, 2025 - proceedings.neurips.cc

We investigate the statistical and computational limits of latent Diffusion Transformers (DiTs)
under the low-dimensional linear latent space assumption. Statistically, we study the …

Cited by 21 - All 5 versions

Scaling rectified flow transformers for high-resolution image synthesis
