- Academic Search

WorldGPT: Empowering LLM as multimodal world model

A survey on potentials, pathways and challenges of large language models in new-generation intelligent manufacturing

C Zhang, Q Xu, Y Yu, G Zhou, K Zeng, F Chang… - Robotics and Computer …, 2025 - Elsevier

Nowadays, Industry 5.0 starts to gain attention, which advocates that intelligent
manufacturing should adequately consider the roles and needs of humans. In this context …

Cited by 5

[PDF] arxiv.org

Momentor: Advancing video large language model with fine-grained temporal reasoning

L Qian, J Li, Y Wu, Y Ye, H Fei, TS Chua… - arxiv preprint arxiv …, 2024 - arxiv.org

Large Language Models (LLMs) demonstrate remarkable proficiency in comprehending and
handling text-based tasks. Many efforts are being made to transfer these attributes to video …

Cited by 29, all 7 versions

[PDF] researchsquare.com

Evaluating the impact of environmental semantic distractions on multimodal large language models

S Kuhozido, G Dunfield, E Ostrich, C Waterhouse - 2024 - researchsquare.com

Multimodal models integrating visual and textual data have transformed artificial intelligence
applications by providing more holistic and contextually aware responses. However, the …

Cited by 55, all 2 versions

[PDF] arxiv.org

Auto-encoding morph-tokens for multimodal LLM

K Pan, S Tang, J Li, Z Fan, W Chow, S Yan… - arxiv preprint arxiv …, 2024 - arxiv.org

For multimodal LLMs, the synergy of visual comprehension (textual output) and generation
(visual output) presents an ongoing challenge. This is due to a conflicting objective: for …

Cited by 14, all 7 versions

[PDF] neurips.cc

Unified generative and discriminative training for multi-modal large language models

W Chow, J Li, Q Yu, K Pan, H Fei… - Advances in …, 2025 - proceedings.neurips.cc

In recent times, Vision-Language Models (VLMs) have been trained under two
predominant paradigms. Generative training has enabled Multimodal Large Language …

Cited by 2, all 5 versions

[PDF] arxiv.org

Fact: Teaching MLLMs with faithful, concise and transferable rationales

M Gao, S Chen, L Pang, Y Yao, J Dang… - Proceedings of the …, 2024 - dl.acm.org

The remarkable performance of Multimodal Large Language Models (MLLMs) has
demonstrated their proficient understanding capabilities in handling various visual tasks …

Cited by 5, all 5 versions

[PDF] arxiv.org

The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

Z Qin, D Chen, W Zhang, L Yao, Y Huang… - arxiv preprint arxiv …, 2024 - arxiv.org

The rapid development of large language models (LLMs) has been witnessed in recent
years. Based on the powerful LLMs, multi-modal LLMs (MLLMs) extend the modality from …

Cited by 5, all 3 versions

[PDF] arxiv.org

A survey on multimodal benchmarks: In the era of large AI models

L Li, G Chen, H Shi, J Xiao, L Chen - arxiv preprint arxiv:2409.18142, 2024 - arxiv.org

The rapid evolution of Multimodal Large Language Models (MLLMs) has brought substantial
advancements in artificial intelligence, significantly enhancing the capability to understand …

Cited by 4, all 2 versions

[PDF] arxiv.org

ACDC: Autoregressive coherent multimodal generation using diffusion correction

H Chung, D Lee, JC Ye - arxiv preprint arxiv:2410.04721, 2024 - arxiv.org

Autoregressive models (ARMs) and diffusion models (DMs) represent two leading
paradigms in generative modeling, each excelling in distinct areas: ARMs in global context …

Cited by 2, all 3 versions

[PDF] arxiv.org

WALL-E: World alignment by rule learning improves world model-based LLM agents

S Zhou, T Zhou, Y Yang, G Long, D Ye, J Jiang… - arxiv preprint arxiv …, 2024 - arxiv.org

Can large language models (LLMs) directly serve as powerful world models for model-
based agents? While the gaps between the prior knowledge of LLMs and the specified …

Cited by 2, all 3 versions
