Agent Lumos: Unified and modular training for open-source language agents
Closed-source agents suffer from several issues such as a lack of affordability, transparency,
and reproducibility, particularly on complex interactive tasks. This motivates the …
Lumos: Learning agents with unified data, modular design, and open-source LLMs
We introduce Lumos, a novel framework for training language agents that employs a unified
data format and a modular architecture based on open-source large language models …
MacGyver: Are Large Language Models Creative Problem Solvers?
We explore the creative problem-solving capabilities of modern large language models
(LLMs) in a constrained setting. The setting requires circumventing a cognitive bias known in …
TaskLAMA: Probing the complex task understanding of language models
Structured Complex Task Decomposition (SCTD) is the problem of breaking down a
complex real-world task (such as planning a wedding) into a directed acyclic graph over …
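To make the SCTD formulation concrete, here is a minimal Python sketch of a task decomposed into a directed acyclic graph, using the wedding example from the abstract; the specific steps and dependency edges are illustrative assumptions, not drawn from the paper.

```python
# Minimal sketch of an SCTD-style decomposition: a complex task represented
# as a directed acyclic graph (DAG). Step names and edges below are
# illustrative assumptions for the "planning a wedding" example.
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Each key is a step; its value is the set of steps that must happen first.
wedding_plan_dag = {
    "set budget": set(),
    "draft guest list": {"set budget"},
    "book venue": {"set budget", "draft guest list"},
    "send invitations": {"draft guest list", "book venue"},
    "hire caterer": {"book venue"},
}

# Any topological order of the DAG is a valid execution order for the plan.
print(list(TopologicalSorter(wedding_plan_dag).static_order()))
```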
STEER: Unified Style Transfer with Expert Reinforcement
While text style transfer has many applications across natural language processing, the core
premise of transferring from a single source style is unrealistic in a real-world setting. In this …
Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos
In this paper we explore the capability of an agent to construct a logical sequence of action
steps, thereby assembling a strategic procedural plan. This plan is crucial for navigating from …
Geometric-averaged preference optimization for soft preference labels
Many algorithms for aligning LLMs with human preferences assume that human preferences
are binary and deterministic. However, human preferences can vary across individuals, and …
CAT-BENCH: Benchmarking Language Model Understanding of Causal and Temporal Dependencies in Plans
Understanding the abilities of LLMs to reason about natural language plans, such as
instructional text and recipes, is critical to reliably using them in decision-making systems. A …
E2CL: Exploration-based Error Correction Learning for Embodied Agents
Language models are exhibiting increasing capability in knowledge utilization and
reasoning. However, when applied as agents in embodied environments, they often suffer …
Do large language models and humans have similar behaviors in causal inference with script knowledge?
Recently, large pre-trained language models (LLMs) have demonstrated superior language
understanding abilities, including zero-shot causal reasoning. However, it is unclear to what …