The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - Science China …, 2025 - Springer
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …

Foundations & trends in multimodal machine learning: Principles, challenges, and open questions

PP Liang, A Zadeh, LP Morency - ACM Computing Surveys, 2024 - dl.acm.org
Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Fine-tuning aligned language models compromises safety, even when users do not intend to!

X Qi, Y Zeng, T Xie, PY Chen, R Jia, P Mittal… - arXiv preprint arXiv …, 2023 - arxiv.org
Optimizing large language models (LLMs) for downstream use cases often involves the
customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama …

Eureka: Human-level reward design via coding large language models

YJ Ma, W Liang, G Wang, DA Huang, O Bastani… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have excelled as high-level semantic planners for
sequential decision-making tasks. However, harnessing them to learn complex low-level …

Drivegpt4: Interpretable end-to-end autonomous driving via large language model

Z Xu, Y Zhang, E Xie, Z Zhao, Y Guo… - IEEE Robotics and …, 2024 - ieeexplore.ieee.org
Multimodal large language models (MLLMs) have emerged as a prominent area of interest
within the research community, given their proficiency in handling and reasoning with non …

Spatialvlm: Endowing vision-language models with spatial reasoning capabilities

B Chen, Z Xu, S Kirmani, B Ichter… - Proceedings of the …, 2024 - openaccess.thecvf.com
Understanding and reasoning about spatial relationships is crucial for Visual Question
Answering (VQA) and robotics. Vision Language Models (VLMs) have shown impressive …

Mobile aloha: Learning bimanual mobile manipulation with low-cost whole-body teleoperation

Z Fu, TZ Zhao, C Finn - arXiv preprint arXiv:2401.02117, 2024 - arxiv.org
Imitation learning from human demonstrations has shown impressive performance in
robotics. However, most results focus on table-top manipulation, lacking the mobility and …

Drivelm: Driving with graph visual question answering

C Sima, K Renz, K Chitta, L Chen, H Zhang… - … on Computer Vision, 2024 - Springer
We study how vision-language models (VLMs) trained on web-scale data can be integrated
into end-to-end driving systems to boost generalization and enable interactivity with human …

Expel: Llm agents are experiential learners

A Zhao, D Huang, Q Xu, M Lin, YJ Liu… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
The recent surge in research interest in applying large language models (LLMs) to decision-
making tasks has flourished by leveraging the extensive world knowledge embedded in …

Unified-io 2: Scaling autoregressive multimodal models with vision language audio and action

J Lu, C Clark, S Lee, Z Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present Unified-IO 2, a multimodal and multi-skill unified model capable of following
novel instructions. Unified-IO 2 can use text, images, audio, and/or videos as input and can …