Academic Search

S Tian, Q Jin, L Yeganova, PT Lai, Q Zhu… - Briefings in …, 2024 - academic.oup.com

ChatGPT has drawn considerable attention from both the general public and domain experts
with its remarkable text generation capabilities. This has subsequently led to the emergence …

Cited by 208

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arxiv preprint arxiv …, 2023 - arxiv.org

Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Cited by 3540

Benchmarking large language models on cmexam-a comprehensive chinese medical exam dataset

J Liu, P Zhou, Y Hua, D Chong, Z Tian… - Advances in …, 2024 - proceedings.neurips.cc

Recent advancements in large language models (LLMs) have transformed the field of
question answering (QA). However, evaluating LLMs in the medical field is challenging due …

Cited by 73

Can llms augment low-resource reading comprehension datasets? opportunities and challenges

V Samuel, H Aynaou, AG Chowdhury… - arxiv preprint arxiv …, 2023 - arxiv.org

Large Language Models (LLMs) have demonstrated impressive zero shot performance on a
wide range of NLP tasks, demonstrating the ability to reason and apply commonsense. A …

Cited by 11

The dawn after the dark: An empirical study on factuality hallucination in large language models

J Li, J Chen, R Ren, X Cheng, WX Zhao, JY Nie… - arxiv preprint arxiv …, 2024 - arxiv.org

In the era of large language models (LLMs), hallucination (i.e., the tendency to generate
factually incorrect content) poses great challenge to trustworthy and reliable deployment of …

Cited by 64

Linguistic calibration of longform generations

N Band, X Li, T Ma… - Forty-first …, 2024 - storage.prod.researchhub.com

Abstract Language models (LMs) may lead their users to make suboptimal downstream
decisions when they confidently hallucinate. This issue can be mitigated by having the LM …

Cited by 11

Is ChatGPT a biomedical expert?--exploring the zero-shot performance of current GPT models in biomedical tasks

S Ateia, U Kruschwitz - arxiv preprint arxiv:2306.16108, 2023 - arxiv.org

We assessed the performance of commercial Large Language Models (LLMs) GPT-3.5-
Turbo and GPT-4 on tasks from the 2023 BioASQ challenge. In Task 11b Phase B, which is …

Cited by 35

Overview of bioasq 2023: The eleventh bioasq challenge on large-scale biomedical semantic indexing and question answering

A Nentidis, G Katsimpras, A Krithara… - … Conference of the Cross …, 2023 - Springer

This is an overview of the eleventh edition of the BioASQ challenge in the context of the
Conference and Labs of the Evaluation Forum (CLEF) 2023. BioASQ is a series of …

Cited by 35

Lab-bench: Measuring capabilities of language models for biology research

JM Laurent, JD Janizek, M Ruzo, MM Hinks… - arxiv preprint arxiv …, 2024 - arxiv.org

There is widespread optimism that frontier Large Language Models (LLMs) and LLM-
augmented systems have the potential to rapidly accelerate scientific discovery across …

Cited by 12

Halueval-wild: Evaluating hallucinations of language models in the wild

Z Zhu, Y Yang, Z Sun - arxiv preprint arxiv:2403.04307, 2024 - arxiv.org

Hallucinations pose a significant challenge to the reliability of large language models
(LLMs) in critical domains. Recent benchmarks designed to assess LLM hallucinations …

Cited by 7

BioASQ-QA: A manually curated corpus for Biomedical Question Answering

Opportunities and challenges for ChatGPT and large language models in biomedicine and health
