A review of recurrent neural networks: LSTM cells and network architectures
Y Yu, X Si, C Hu, J Zhang - Neural computation, 2019 - direct.mit.edu
Recurrent neural networks (RNNs) have been widely adopted in research areas concerned
with sequential data, such as text, audio, and video. However, RNNs consisting of sigma …
Image and video compression with neural networks: A review
In recent years, the image and video coding technologies have advanced by leaps and
bounds. However, due to the popularization of image and video acquisition devices, the …
Preserve your own correlation: A noise prior for video diffusion models
Despite tremendous progress in generating high-quality images using diffusion models,
synthesizing a sequence of animated frames that are both photorealistic and temporally …
Imagen video: High definition video generation with diffusion models
We present Imagen Video, a text-conditional video generation system based on a cascade
of video diffusion models. Given a text prompt, Imagen Video generates high definition …
Sequential modeling enables scalable learning for large vision models
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …
DriveDreamer: Towards real-world-driven world models for autonomous driving
World models, especially in autonomous driving, are trending and drawing extensive
attention due to their capacity for comprehending driving environments. The established …
Phenaki: Variable length video generation from open domain textual descriptions
We present Phenaki, a model capable of realistic video synthesis given a sequence of
textual prompts. Generating videos from text is particularly challenging due to the …
Factorizing text-to-video generation by explicit image conditioning
We present Emu Video, a text-to-video generation model that factorizes the
generation into two steps: first generating an image conditioned on the text, and then …
SimVP: Simpler yet better video prediction
From CNN, RNN, to ViT, we have witnessed remarkable advancements in video
prediction, incorporating auxiliary inputs, elaborate neural architectures, and sophisticated …
StyleGAN-V: A continuous video generator with the price, image quality and perks of StyleGAN2
I Skorokhodov, S Tulyakov… - Proceedings of the …, 2022 - openaccess.thecvf.com
Videos show continuous events, yet most--if not all--video synthesis frameworks treat them
discretely in time. In this work, we think of videos as what they should be--time-continuous …