Google Scholar

Visual autoregressive modeling: Scalable image generation via next-scale prediction

K Tian, Y Jiang, Z Yuan, B Peng… - Advances in neural …, 2025 - proceedings.neurips.cc

We present Visual AutoRegressive modeling (VAR), a new generation paradigm
that redefines the autoregressive learning on images as coarse-to-fine "next-scale …

Cited by 159

Simplified and generalized masked diffusion for discrete data

J Shi, K Han, Z Wang, A Doucet… - Advances in Neural …, 2025 - proceedings.neurips.cc

Masked (or absorbing) diffusion is actively explored as an alternative to autoregressive
models for generative modeling of discrete data. However, existing work in this area has …

Cited by 34

Freelong: Training-free long video generation with spectralblend temporal attention

Y Lu, Y Liang, L Zhu, Y Yang - Advances in Neural …, 2025 - proceedings.neurips.cc

Video diffusion models have made substantial progress in various video generation
applications. However, training models for long video generation tasks require significant …

Cited by 14

Vidu4d: Single generated video to high-fidelity 4d reconstruction with dynamic gaussian surfels

Y Wang, X Wang, Z Chen, Z Wang… - Advances in Neural …, 2025 - proceedings.neurips.cc

Video generative models are receiving particular attention given their ability to generate
realistic and imaginative frames. Besides, these models are also observed to exhibit strong …

Cited by 14

T2vsafetybench: Evaluating the safety of text-to-video generative models

Y Miao, Y Zhu, L Yu, J Zhu, XS Gao… - Advances in Neural …, 2025 - proceedings.neurips.cc

The recent development of Sora leads to a new era in text-to-video (T2V) generation. Along
with this comes the rising concern about its safety risks. The generated videos may contain …

Cited by 7

Mimicmotion: High-quality human motion video generation with confidence-aware pose guidance

Y Zhang, J Gu, LW Wang, H Wang, J Cheng… - arXiv preprint arXiv …, 2024 - arxiv.org

In recent years, generative artificial intelligence has achieved significant advancements in
the field of image generation, spawning a variety of applications. However, video generation …

Cited by 39

Neural residual diffusion models for deep scalable vision generation

Z Ma, L Zhao, B Qi, B Zhou - Advances in Neural …, 2025 - proceedings.neurips.cc

The most advanced diffusion models have recently adopted increasingly deep stacked
networks (e.g., U-Net or Transformer) to promote the generative emergence capabilities of …

Cited by 3

Pandora: Towards general world model with natural language actions and video states

J Xiang, G Liu, Y Gu, Q Gao, Y Ning, Y Zha… - arXiv preprint arXiv …, 2024 - arxiv.org

World models simulate future states of the world in response to different actions. They
facilitate interactive content creation and provides a foundation for grounded, long-horizon …

Cited by 20

Infinity: Scaling bitwise autoregressive modeling for high-resolution image synthesis

J Han, J Liu, Y Jiang, B Yan, Y Zhang, Z Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org

We present Infinity, a Bitwise Visual AutoRegressive Modeling capable of generating high-
resolution, photorealistic images following language instruction. Infinity redefines visual …

Cited by 9

Od-vae: An omni-dimensional video compressor for improving latent video diffusion model

L Chen, Z Li, B Lin, B Zhu, Q Wang, S Yuan… - arXiv preprint arXiv …, 2024 - arxiv.org

Variational Autoencoder (VAE), compressing videos into latent representations, is a crucial
preceding component of Latent Video Diffusion Models (LVDMs). With the same …

Cited by 9

Vidu: a highly consistent, dynamic and skilled text-to-video generator with diffusion models

Visual autoregressive modeling: Scalable image generation via next-scale prediction
