Superhuman performance of a large language model on the reasoning tasks of a physician

PG Brodeur, TA Buckley, Z Kanjee, E Goh… - arXiv preprint arXiv …, 2024 - arxiv.org
Performance of large language models (LLMs) on medical tasks has traditionally been
evaluated using multiple-choice question benchmarks. However, such benchmarks are …

Application of large language models in disease diagnosis and treatment

X Yang, T Li, Q Su, Y Liu, C Kang, Y Lyu… - Chinese Medical …, 2025 - mednexus.org
Large language models (LLMs) such as ChatGPT, Claude, Llama, and Qwen are emerging
as transformative technologies for the diagnosis and treatment of various diseases. With …