Google Scholar

A survey on video diffusion models

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org

The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Cited by 98

[PDF] springer.com

Perceptual video quality assessment: A survey

X Min, H Duan, W Sun, Y Zhu, G Zhai - Science China Information …, 2024 - Springer

Perceptual video quality assessment plays a vital role in the field of video processing due to
the existence of quality degradations introduced in various stages of video signal …

Cited by 86

[PDF] arxiv.org

LaVie: High-quality video generation with cascaded latent diffusion models

Y Wang, X Chen, X Ma, S Zhou, Z Huang… - International Journal of …, 2024 - Springer

This work aims to learn a high-quality text-to-video (T2V) generative model by leveraging a
pre-trained text-to-image (T2I) model as a basis. It is a highly desirable yet challenging task …

Cited by 232

[PDF] thecvf.com

FreeU: Free lunch in diffusion U-Net

C Si, Z Huang, Y Jiang, Z Liu - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com

In this paper we uncover the untapped potential of diffusion U-Net which serves as a "free
lunch" that substantially improves the generation quality on the fly. We initially investigate …

Cited by 114

[PDF] arxiv.org

Show-1: Marrying pixel and latent diffusion models for text-to-video generation

DJ Zhang, JZ Wu, JW Liu, R Zhao, L Ran, Y Gu… - International Journal of …, 2024 - Springer

Significant advancements have been achieved in the realm of large-scale pre-trained text-to-
video Diffusion Models (VDMs). However, previous methods either rely solely on pixel …

Cited by 172

[PDF] arxiv.org

MotionCtrl: A unified and flexible motion controller for video generation

Z Wang, Z Yuan, X Wang, Y Li, T Chen, M Xia… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org

Motions in a video primarily consist of camera motion, induced by camera movement, and
object motion, resulting from object movement. Accurate control of both camera and object …

Cited by 131

[PDF] neurips.cc

MiraData: A large-scale video dataset with long durations and structured captions

X Ju, Y Gao, Z Zhang, Z Yuan… - Advances in …, 2025 - proceedings.neurips.cc

Sora's high-motion intensity and long consistent videos have significantly impacted the field
of video generation, attracting unprecedented attention. However, existing publicly available …

Cited by 30

[PDF] arxiv.org

Emu3: Next-token prediction is all you need

X Wang, X Zhang, Z Luo, Q Sun, Y Cui, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

While next-token prediction is considered a promising path towards artificial general
intelligence, it has struggled to excel in multimodal tasks, which are still dominated by …

Cited by 77

[PDF] thecvf.com

VideoBooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com

Text-driven video generation witnesses rapid progress. However merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …

Cited by 50

[PDF] neurips.cc

COVE: Unleashing the diffusion feature correspondence for consistent video editing

J Wang, Y Ma, J Guo, Y Xiao… - Advances in Neural …, 2025 - proceedings.neurips.cc

Video editing is an emerging task, in which most current methods adopt the pre-trained text-
to-image (T2I) diffusion model to edit the source video in a zero-shot manner. Despite …

Cited by 15
