- Academic Search

Agent-pro: Learning to evolve via policy-level reflection and optimization

W Zhang, K Tang, H Wu, M Wang, Y Shen… - arxiv preprint arxiv …, 2024 - arxiv.org

Large Language Models exhibit robust problem-solving capabilities for diverse tasks.
However, most LLM-based agents are designed as specific task solvers with sophisticated …

Robogolf: Mastering real-world minigolf with a reflective multi-modality vision-language model

H Zhou, T Ji, L Sommerhalder, M Goerner… - arxiv preprint arxiv …, 2024 - arxiv.org

Minigolf is an exemplary real-world game for examining embodied intelligence, requiring
challenging spatial and kinodynamic understanding to putt the ball. Additionally, reflective …

Clickagent: Enhancing ui location capabilities of autonomous agents

J Hoscilowicz, B Maj, B Kozakiewicz… - arxiv preprint arxiv …, 2024 - arxiv.org

With the growing reliance on digital devices equipped with graphical user interfaces (GUIs),
such as computers and smartphones, the need for effective automation tools has become …

OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents

Z Wang, S Cai, Z Mu, H Lin, C Zhang, X Liu, Q Li… - arxiv preprint arxiv …, 2024 - arxiv.org

This paper presents OmniJARVIS, a novel Vision-Language-Action (VLA) model for open-world
instruction-following agents in Minecraft. Compared to prior works that either emit …

Autonomous Mental Development at the Individual and Collective Levels: Concept and Challenges

M Lippi, S Mariani, M Martinelli, F Zambonelli - IEEE Access, 2024 - ieeexplore.ieee.org

The increasing complexity and unpredictability of many ICT scenarios let us envision that
future systems will have to dynamically learn how to act and adapt to face evolving situations …

A taxonomy of architecture options for foundation model-based agents: Analysis and decision model

J Zhou, Q Lu, J Chen, L Zhu, X Xu, Z Xing… - arxiv preprint arxiv …, 2024 - arxiv.org

The rapid advancement of AI technology has led to widespread applications of agent
systems across various domains. However, the need for detailed architecture design poses …

AI to publish knowledge: a tectonic shift

T Lemberger - EMBO reports, 2024 - embopress.org

The rise of generative AI will transform scientific publishing but it also poses risks. While AI
enables the dissemination of knowledge in computable form, preserving transparency and …

Position: Foundation Agents as the Paradigm Shift for Decision Making

X Liu, X Lou, J Jiao, J Zhang - arxiv preprint arxiv:2405.17009, 2024 - arxiv.org

Decision making demands intricate interplay between perception, memory, and reasoning to
discern optimal policies. Conventional approaches to decision making face challenges …

Smart Mobility with Agent-based Foundation Models: Towards Interactive and Collaborative Intelligent Vehicles

B Xia, P Xie, J Wang - IEEE Transactions on Intelligent Vehicles, 2024 - ieeexplore.ieee.org

This letter reports the insights gained during a Distributed/Decentralized Hybrid Workshop
on Foundation/Infrastructure Intelligence (FII), where we discussed the evolving role of …

Large Physics Models: Towards a collaborative approach with Large Language Models and Foundation Models

KG Barman, S Caron, E Sullivan, HW de Regt… - arxiv preprint arxiv …, 2025 - arxiv.org

This paper explores ideas and provides a potential roadmap for the development and
evaluation of physics-specific large-scale AI models, which we call Large Physics Models …

An interactive agent foundation model

Agent-pro: Learning to evolve via policy-level reflection and optimization
