Aligning cyber space with physical world: A comprehensive survey on embodied ai
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General
Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace …
Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace …
Vd3d: Taming large video diffusion transformers for 3d camera control
Modern text-to-video synthesis models demonstrate coherent, photorealistic generation of
complex videos from a text description. However, most existing models lack fine-grained …
complex videos from a text description. However, most existing models lack fine-grained …
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers
Numerous works have recently integrated 3D camera control into foundational text-to-video
models, but the resulting camera control is often imprecise, and video generation quality …
models, but the resulting camera control is often imprecise, and video generation quality …