Evaluation and mitigation of the limitations of large language models in clinical decision-making

P Hager, F Jungmann, R Holland, K Bhagat… - Nature Medicine, 2024 - nature.com
Clinical decision-making is one of the most impactful parts of a physician's responsibilities
and stands to benefit greatly from artificial intelligence solutions and large language models …

RLAIF: Scaling reinforcement learning from human feedback with AI feedback

H Lee, S Phatale, H Mansoor, KR Lu, T Mesnard… - 2023 - openreview.net
Reinforcement learning from human feedback (RLHF) is an effective technique for aligning
large language models (LLMs) to human preferences, but gathering high-quality human …

Can generalist foundation models outcompete special-purpose tuning? Case study in medicine

H Nori, YT Lee, S Zhang, D Carignan, R Edgar… - arXiv preprint arXiv …, 2023 - arxiv.org
Generalist foundation models such as GPT-4 have displayed surprising capabilities in a
wide variety of domains and tasks. Yet, there is a prevalent assumption that they cannot …

Evaluating large language models at evaluating instruction following

Z Zeng, J Yu, T Gao, Y Meng, T Goyal… - arXiv preprint arXiv …, 2023 - arxiv.org
As research in large language models (LLMs) continues to accelerate, LLM-based
evaluation has emerged as a scalable and cost-effective alternative to human evaluations …

Do LLMs exhibit human-like response biases? A case study in survey design

L Tjuatja, V Chen, T Wu, A Talwalkar… - Transactions of the …, 2024 - direct.mit.edu
One widely cited barrier to the adoption of LLMs as proxies for humans in subjective tasks is
their sensitivity to prompt wording—but interestingly, humans also display sensitivities to …

Preference learning algorithms do not learn preference rankings

A Chen, S Malladi, L Zhang, X Chen… - Advances in …, 2024 - proceedings.neurips.cc
Preference learning algorithms (e.g., RLHF and DPO) are frequently used to steer LLMs to
produce generations that are more preferred by humans, but our understanding of their …

RLAIF vs. RLHF: Scaling reinforcement learning from human feedback with AI feedback

H Lee, S Phatale, H Mansoor, T Mesnard… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) has proven effective in aligning large
language models (LLMs) with human preferences, but gathering high-quality preference …

The prompt report: A systematic survey of prompting techniques

S Schulhoff, M Ilie, N Balepur… - arXiv preprint …, 2024 - readwise-assets.s3.amazonaws.com
Generative Artificial Intelligence (GenAI) systems are being increasingly deployed
across all parts of industry and research settings. Developers and end users interact with …

Introducing v0.5 of the AI Safety Benchmark from MLCommons

B Vidgen, A Agrawal, AM Ahmed, V Akinwande… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the
MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to …

A survey on stability of learning with limited labelled data and its sensitivity to the effects of randomness

B Pecher, I Srba, M Bielikova - ACM Computing Surveys, 2024 - dl.acm.org
Learning with limited labelled data, such as prompting, in-context learning, fine-tuning, meta-
learning, or few-shot learning, aims to effectively train a model using only a small amount of …