Large language model-brained GUI agents: A survey
GUIs have long been central to human-computer interaction, providing an intuitive and
visually-driven way to access and interact with digital systems. The advent of LLMs …
A human-inspired reading agent with gist memory of very long contexts
Current Large Language Models (LLMs) are not only limited to some maximum context
length, but also are not able to robustly consume long inputs. To address these limitations …
Tur[k]ingBench: A challenge benchmark for web agents
Can advanced multi-modal models effectively tackle complex web-based tasks? Such tasks
are often found on crowdsourcing platforms, where crowdworkers engage in challenging …
AI Agents for Computer Use: A Review of Instruction-based Computer Control, GUI Automation, and Operator Assistants
Instruction-based computer control agents (CCAs) execute complex action sequences on
personal computers or mobile devices to fulfill tasks using the same graphical user …
Meta-task planning for language agents
The rapid advancement of neural language models has sparked a new surge of intelligent
agent research. Unlike traditional agents, large language model-based agents (LLM agents) …
GUI agents: A survey
Graphical User Interface (GUI) agents, powered by Large Foundation Models, have
emerged as a transformative approach to automating human-computer interaction. These …
Infrastructure for AI Agents
Increasingly many AI systems can plan and execute interactions in open-ended
environments, such as making phone calls or buying online goods. As developers grow the …
Planning with Multi-Constraints via Collaborative Language Agents
The rapid advancement of neural language models has sparked a new surge of intelligent
agent research. Unlike traditional agents, large language model-based agents (LLM agents) …
Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing
As training datasets grow larger, we aspire to develop models that generalize well to any
diverse test distribution, even if the latter deviates significantly from the training data. Various …
OS agents: A survey on MLLM-based agents for general computing devices use
The dream to create AI assistants as capable and versatile as the fictional JARVIS from Iron
Man has long captivated imaginations. With the evolution of (multimodal) large language …