Tool learning with foundation models
Humans possess an extraordinary ability to create and utilize tools. With the advent of
foundation models, artificial intelligence systems have the potential to be equally adept in …
Mind2Web: Towards a generalist agent for the web
We introduce Mind2Web, the first dataset for developing and evaluating generalist
agents for the web that can follow language instructions to complete complex tasks on any …
Language models can solve computer tasks
Agents capable of carrying out general tasks on a computer can improve efficiency and
productivity by automating repetitive tasks and assisting in complex problem-solving. Ideally …
WebArena: A realistic web environment for building autonomous agents
With advances in generative AI, there is now potential for autonomous agents to manage
daily tasks via natural language commands. However, current agents are primarily created …
Androidinthewild: A large-scale dataset for Android device control
C Rawles, A Li, D Rodriguez… - Advances in Neural …, 2023 - proceedings.neurips.cc
There is a growing interest in device-control systems that can interpret human natural
language instructions and execute them on a digital device by directly controlling its user …
AutoGen: Enabling next-gen LLM applications via multi-agent conversation
AutoGen is an open-source framework that allows developers to build LLM applications via
multiple agents that can converse with each other to accomplish tasks. AutoGen agents are …
Personal LLM agents: Insights and survey about the capability, efficiency and security
Since the advent of personal computing devices, intelligent personal assistants (IPAs) have
been one of the key technologies that researchers and engineers have focused on, aiming …
GPT-4V(ision) is a generalist web agent, if grounded
The recent development on large multimodal models (LMMs), especially GPT-4V(ision) and
Gemini, has been quickly expanding the capability boundaries of multimodal models …
WebShop: Towards scalable real-world web interaction with grounded language agents
Most existing benchmarks for grounding language in interactive environments either lack
realistic linguistic elements, or prove difficult to scale up due to substantial human …
Language agent tree search unifies reasoning acting and planning in language models
While language models (LMs) have shown potential across a range of decision-making
tasks, their reliance on simple acting processes limits their broad deployment as …