Understanding World or Predicting Future? A Comprehensive Survey of World Models

J Ding, Y Zhang, Y Shang, Y Zhang, Z Zong… - arXiv preprint arXiv …, 2024 - arxiv.org
The concept of world models has garnered significant attention due to advancements in
multimodal large language models such as GPT-4 and video generation models such as …

SpaceBlender: Creating context-rich collaborative spaces through generative 3D scene blending

N Numan, S Rajaram, BT Kumaravel… - Proceedings of the 37th …, 2024 - dl.acm.org
There is increased interest in using generative AI to create 3D spaces for Virtual Reality (VR)
applications. However, today's models produce artificial environments, falling short of …

CrossViewDiff: A cross-view diffusion model for satellite-to-street view synthesis

W Li, J He, J Ye, H Zhong, Z Zheng, Z Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
Satellite-to-street view synthesis aims at generating a realistic street-view image from its
corresponding satellite-view image. Although stable diffusion models have exhibited …

StarGen: A Spatiotemporal Autoregression Framework with Video Diffusion Model for Scalable and Controllable Scene Generation

S Zhai, Z Ye, J Liu, W Xie, J Hu, Z Peng, H Xue… - arXiv preprint arXiv …, 2025 - arxiv.org
Recent advances in large reconstruction and generative models have significantly improved
scene reconstruction and novel view generation. However, due to compute limitations, each …

Lifelong Learning of Video Diffusion Models From a Single Video Stream

J Yoo, Y He, S Naderiparizi, D Green, GM van de Ven… - 2024 - lirias.kuleuven.be
This work demonstrates that training autoregressive video diffusion models from a single,
continuous video stream is not only possible but remarkably can also be competitive with …