Opportunities and challenges for ChatGPT and large language models in biomedicine and health
ChatGPT has drawn considerable attention from both the general public and domain experts
with its remarkable text generation capabilities. This has subsequently led to the emergence …
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
Benchmarking large language models on CMExam - a comprehensive Chinese medical exam dataset
Recent advancements in large language models (LLMs) have transformed the field of
question answering (QA). However, evaluating LLMs in the medical field is challenging due …
Can LLMs augment low-resource reading comprehension datasets? Opportunities and challenges
Large Language Models (LLMs) have demonstrated impressive zero shot performance on a
wide range of NLP tasks, demonstrating the ability to reason and apply commonsense. A …
The dawn after the dark: An empirical study on factuality hallucination in large language models
In the era of large language models (LLMs), hallucination (i.e., the tendency to generate
factually incorrect content) poses great challenge to trustworthy and reliable deployment of …
Linguistic calibration of longform generations
Language models (LMs) may lead their users to make suboptimal downstream
decisions when they confidently hallucinate. This issue can be mitigated by having the LM …
Is ChatGPT a biomedical expert? Exploring the zero-shot performance of current GPT models in biomedical tasks
We assessed the performance of commercial Large Language Models (LLMs) GPT-3.5-
Turbo and GPT-4 on tasks from the 2023 BioASQ challenge. In Task 11b Phase B, which is …
Overview of BioASQ 2023: The eleventh BioASQ challenge on large-scale biomedical semantic indexing and question answering
This is an overview of the eleventh edition of the BioASQ challenge in the context of the
Conference and Labs of the Evaluation Forum (CLEF) 2023. BioASQ is a series of …
LAB-Bench: Measuring capabilities of language models for biology research
There is widespread optimism that frontier Large Language Models (LLMs) and LLM-
augmented systems have the potential to rapidly accelerate scientific discovery across …
HaluEval-Wild: Evaluating hallucinations of language models in the wild
Hallucinations pose a significant challenge to the reliability of large language models
(LLMs) in critical domains. Recent benchmarks designed to assess LLM hallucinations …