Empowering biomedical discovery with AI agents

S Gao, A Fang, Y Huang, V Giunchiglia, A Noori… - Cell, 2024 - cell.com
We envision "AI scientists" as systems capable of skeptical learning and reasoning that
empower biomedical research through collaborative agents that integrate AI models and …

The Llama 3 herd of models

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

MMMU: A massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI

X Yue, Y Ni, K Zhang, T Zheng, R Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
We introduce MMMU: a new benchmark designed to evaluate multimodal models on
massive multi-discipline tasks demanding college-level subject knowledge and deliberate …

VisualWebArena: Evaluating multimodal agents on realistic visual web tasks

JY Koh, R Lo, L Jang, V Duvvur, MC Lim… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous agents capable of planning, reasoning, and executing actions on the web offer
a promising avenue for automating computer tasks. However, the majority of existing …

Towards general computer control: A multimodal agent for Red Dead Redemption II as a case study

W Tan, Z Ding, W Zhang, B Li, B Zhou… - ICLR 2024 Workshop …, 2024 - openreview.net
Despite the success in specific tasks and scenarios, existing foundation agents, empowered
by large models (LMs) and advanced tools, still cannot generalize to different scenarios …

Open-Ethical AI: Advancements in Open-Source Human-Centric Neural Language Models

S Sicari, JF Cevallos M, A Rizzardi… - ACM Computing …, 2024 - dl.acm.org
This survey summarises the most recent methods for building and assessing helpful, honest,
and harmless neural language models, considering small, medium, and large-size models …

GLoRe: When, where, and how to improve LLM reasoning via global and local refinements

A Havrilla, S Raparthy, C Nalmpantis… - arXiv preprint arXiv …, 2024 - arxiv.org
State-of-the-art language models can exhibit impressive reasoning refinement capabilities
on math, science or coding tasks. However, recent work demonstrates that even the best …

Language agents as optimizable graphs

M Zhuge, W Wang, L Kirsch, F Faccio… - arXiv preprint arXiv …, 2024 - arxiv.org
Various human-designed prompt engineering techniques have been proposed to improve
problem solvers based on Large Language Models (LLMs), yielding many disparate code …

m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks

Z Ma, W Huang, J Zhang, T Gupta… - European Conference on …, 2024 - Springer
Real-world multi-modal problems are rarely solved by a single machine learning model, and
often require multi-step computational plans that involve stitching several models. Tool …

OS-Copilot: Towards generalist computer agents with self-improvement

Z Wu, C Han, Z Ding, Z Weng, Z Liu, S Yao… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous interaction with the computer has been a longstanding challenge with great
potential, and the recent proliferation of large language models (LLMs) has markedly …