Google Scholar

Y Liu, W Chen, Y Bai, X Liang, G Li, W Gao… - arXiv preprint arXiv …, 2024 - arxiv.org

Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General
Intelligence (AGI) and serves as a foundation for various applications that bridge cyberspace …

Cited by 37 · All 3 versions

DriveDreamer4D: World models are effective data machines for 4D driving scene representation

G Zhao, C Ni, X Wang, Z Zhu, X Zhang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Closed-loop simulation is essential for advancing end-to-end autonomous driving systems.
Contemporary sensor simulation methods, such as NeRF and 3DGS, rely predominantly on …

Cited by 7 · All 2 versions

Understanding World or Predicting Future? A Comprehensive Survey of World Models

J Ding, Y Zhang, Y Shang, Y Zhang, Z Zong… - arXiv preprint arXiv …, 2024 - arxiv.org

The concept of world models has garnered significant attention due to advancements in
multimodal large language models such as GPT-4 and video generation models such as …

Cited by 1 · All 2 versions

Towards world simulator: Crafting physical commonsense-based benchmark for video generation

F Meng, J Liao, X Tan, W Shao, Q Lu, K Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org

Text-to-video (T2V) models like Sora have made significant strides in visualizing complex
prompts, which is increasingly viewed as a promising path towards constructing the …

Cited by 7 · All 2 versions

AVID: Adapting video diffusion models to world models

M Rigter, T Gupta, A Hilmkil, C Ma - arXiv preprint arXiv:2410.12822, 2024 - arxiv.org

Large-scale generative models have achieved remarkable success in a number of domains.
However, for sequential decision-making problems, such as robotics, action-labelled data is …

Cited by 4 · All 2 versions

ACDC: Autoregressive coherent multimodal generation using diffusion correction

H Chung, D Lee, JC Ye - arXiv preprint arXiv:2410.04721, 2024 - arxiv.org

Autoregressive models (ARMs) and diffusion models (DMs) represent two leading
paradigms in generative modeling, each excelling in distinct areas: ARMs in global context …

Cited by 2 · All 3 versions

DrivingGPT: Unifying driving world modeling and planning with multi-modal autoregressive transformers

Y Chen, Y Wang, Z Zhang - arXiv preprint arXiv:2412.18607, 2024 - arxiv.org

World model-based searching and planning are widely recognized as a promising path
toward human-level physical intelligence. However, current driving world models primarily …

Cited by 3 · All 4 versions

ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration

C Ni, G Zhao, X Wang, Z Zhu, W Qin, G Huang… - arXiv preprint arXiv …, 2024 - arxiv.org

Closed-loop simulation is crucial for end-to-end autonomous driving. Existing sensor
simulation methods (e.g., NeRF and 3DGS) reconstruct driving scenes based on conditions …

Cited by 1 · All 2 versions

Autoregressive Models in Vision: A Survey

J Xiong, G Liu, L Huang, C Wu, T Wu, Y Mu… - arXiv preprint arXiv …, 2024 - arxiv.org

Autoregressive modeling has been a huge success in the field of natural language
processing (NLP). Recently, autoregressive models have emerged as a significant area of …

Cited by 1 · All 3 versions

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

X Chi, Y Wang, A Cheng, P Fang, Z Tian, Y He… - arXiv preprint arXiv …, 2024 - arxiv.org

Massive multi-modality datasets play a significant role in facilitating the success of large
video-language models. However, current video-language datasets primarily provide text …

Cited by 1 · All 3 versions

Pandora: Towards General World Model with Natural Language Actions and Video States

Aligning cyber space with physical world: A comprehensive survey on embodied AI