Agent-Pro: Learning to evolve via policy-level reflection and optimization
Large Language Models exhibit robust problem-solving capabilities for diverse tasks.
However, most LLM-based agents are designed as specific task solvers with sophisticated …
RoboGolf: Mastering real-world minigolf with a reflective multi-modality vision-language model
Minigolf is an exemplary real-world game for examining embodied intelligence, requiring
challenging spatial and kinodynamic understanding to putt the ball. Additionally, reflective …
ClickAgent: Enhancing UI location capabilities of autonomous agents
J Hoscilowicz, B Maj, B Kozakiewicz… - arXiv preprint arXiv …, 2024 - arxiv.org
With the growing reliance on digital devices equipped with graphical user interfaces (GUIs),
such as computers and smartphones, the need for effective automation tools has become …
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents
This paper presents OmniJARVIS, a novel Vision-Language-Action (VLA) model for open-
world instruction-following agents in Minecraft. Compared to prior works that either emit …
Autonomous Mental Development at the Individual and Collective Levels: Concept and Challenges
The increasing complexity and unpredictability of many ICT scenarios let us envision that
future systems will have to dynamically learn how to act and adapt to face evolving situations …
A taxonomy of architecture options for foundation model-based agents: Analysis and decision model
The rapid advancement of AI technology has led to widespread applications of agent
systems across various domains. However, the need for detailed architecture design poses …
AI to publish knowledge: a tectonic shift
T Lemberger - EMBO reports, 2024 - embopress.org
The rise of generative AI will transform scientific publishing but it also poses risks. While AI
enables the dissemination of knowledge in computable form, preserving transparency and …
Position: Foundation Agents as the Paradigm Shift for Decision Making
Decision making demands intricate interplay between perception, memory, and reasoning to
discern optimal policies. Conventional approaches to decision making face challenges …
Smart Mobility with Agent-based Foundation Models: Towards Interactive and Collaborative Intelligent Vehicles
This letter reports the insights gained during a Distributed/Decentralized Hybrid Workshop
on Foundation/Infrastructure Intelligence (FII), where we discussed the evolving role of …
Large Physics Models: Towards a collaborative approach with Large Language Models and Foundation Models
This paper explores ideas and provides a potential roadmap for the development and
evaluation of physics-specific large-scale AI models, which we call Large Physics Models …