Autonomous medical evaluation for guideline adherence of large language models

D Fast, LC Adams, F Busch, C Fallon, M Huppertz… - NPJ Digital …, 2024 - nature.com
Autonomous Medical Evaluation for Guideline Adherence (AMEGA) is a
comprehensive benchmark designed to evaluate large language models' adherence to …

Rationale-Guided Retrieval Augmented Generation for Medical Question Answering

J Sohn, Y Park, C Yoon, S Park, H Hwang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLM) hold significant potential for applications in biomedicine, but
they struggle with hallucinations and outdated knowledge. While retrieval-augmented …

Multi-hop Evidence Pursuit Meets the Web: Team Papelo at FEVER 2024

C Malon - arXiv preprint arXiv:2411.05762, 2024 - arxiv.org
Separating disinformation from fact on the web has long challenged both the search and the
reasoning powers of humans. We show that the reasoning power of large language models …

Retrieving Semantics for Fact-Checking: A Comparative Approach using CQ (Claim to Question) & AQ (Answer to Question)

N Urbani, S Modha, G Pasi - … of the Seventh Fact Extraction and …, 2024 - aclanthology.org
Fact-checking using evidences is the preferred way to tackle the issue of misinformation in
the society. The democratization of information through social media has accelerated the …

RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models

H Tran, Z Yao, J Wang, Y Zhang, Z Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
This work introduces RARE (Retrieval-Augmented Reasoning Enhancement), a versatile
extension to the mutual reasoning framework (rStar), aimed at enhancing reasoning …

Session Introduction: AI and Machine Learning in Clinical Medicine: Generative and Interactive Systems at the Human-Machine Interface

F Nateghi Haredasht, D Kim, JD Romano… - … 2025: Proceedings of …, 2024 - World Scientific
Artificial Intelligence (AI) technologies are increasingly capable of processing complex and
multilayered datasets. Innovations in generative AI and deep learning have notably …

Embracing Foundation Models for Advancing Scientific Discovery

S Guo, AH Shariatmadari, G Xiong… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Machine learning foundation models, particularly large language models (LLMs) such as
GPT-4o, have revolutionized traditional applications in computer vision and natural …

GuidelineGuard: An Agentic Framework for Medical Note Evaluation with Guideline Adherence

MD Shahriyear - arXiv preprint arXiv:2411.06264, 2024 - arxiv.org
Although rapid advancements in Large Language Models (LLMs) are facilitating the
integration of artificial intelligence-based applications and services in healthcare, limited …