Empowering biomedical discovery with AI agents
We envision "AI scientists" as systems capable of skeptical learning and reasoning that
empower biomedical research through collaborative agents that integrate AI models and …
The Llama 3 herd of models
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …
MMMU: A massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI
We introduce MMMU: a new benchmark designed to evaluate multimodal models on
massive multi-discipline tasks demanding college-level subject knowledge and deliberate …
VisualWebArena: Evaluating multimodal agents on realistic visual web tasks
Autonomous agents capable of planning, reasoning, and executing actions on the web offer
a promising avenue for automating computer tasks. However, the majority of existing …
Towards general computer control: A multimodal agent for Red Dead Redemption II as a case study
Despite the success in specific tasks and scenarios, existing foundation agents, empowered
by large models (LMs) and advanced tools, still cannot generalize to different scenarios …
Open-Ethical AI: Advancements in Open-Source Human-Centric Neural Language Models
This survey summarises the most recent methods for building and assessing helpful, honest,
and harmless neural language models, considering small, medium, and large-size models …
GLoRe: When, where, and how to improve LLM reasoning via global and local refinements
A Havrilla, S Raparthy, C Nalmpantis… - arXiv preprint arXiv …, 2024 - arxiv.org
State-of-the-art language models can exhibit impressive reasoning refinement capabilities
on math, science or coding tasks. However, recent work demonstrates that even the best …
Language agents as optimizable graphs
Various human-designed prompt engineering techniques have been proposed to improve
problem solvers based on Large Language Models (LLMs), yielding many disparate code …
m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
Real-world multi-modal problems are rarely solved by a single machine learning model, and
often require multi-step computational plans that involve stitching several models. Tool …
OS-Copilot: Towards generalist computer agents with self-improvement
Autonomous interaction with the computer has been a longstanding challenge with great
potential, and the recent proliferation of large language models (LLMs) has markedly …