A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions

O Shorinwa, Z Mei, J Lidard, AZ Ren… - arxiv preprint arxiv …, 2024 - arxiv.org
The remarkable performance of large language models (LLMs) in content generation,
coding, and common-sense reasoning has spurred widespread integration into many facets …

Rethinking Uncertainty Estimation in Natural Language Generation

L Aichberger, K Schweighofer, S Hochreiter - arxiv preprint arxiv …, 2024 - arxiv.org
Large Language Models (LLMs) are increasingly employed in real-world applications,
driving the need to evaluate the trustworthiness of their generated text. To this end, reliable …

Calibrating Large Language Models Using Their Generations Only

D Ulmer, M Gubri, H Lee, S Yun, SJ Oh - arxiv preprint arxiv:2403.05973, 2024 - arxiv.org
As large language models (LLMs) are increasingly deployed in user-facing applications,
building trust and maintaining safety by accurately quantifying a model's confidence in its …

RAG-Check: Evaluating multimodal retrieval augmented generation performance

M Mortaheb, MAA Khojastepour… - arxiv preprint arxiv …, 2025 - arxiv.org
Retrieval-augmented generation (RAG) improves large language models (LLMs) by using
external knowledge to guide response generation, reducing hallucinations. However, RAG …

Are LLM-judges robust to expressions of uncertainty? Investigating the effect of epistemic markers on LLM-based evaluation

D Lee, Y Hwang, Y Kim, J Park, K Jung - arxiv preprint arxiv:2410.20774, 2024 - arxiv.org
In line with the principle of honesty, there has been a growing effort to train large language
models (LLMs) to generate outputs containing epistemic markers. However, evaluation in …

Graph-based Confidence Calibration for Large Language Models

Y Li, S Wang, L Huang, LP Liu - arxiv preprint arxiv:2411.02454, 2024 - arxiv.org
One important approach to improving the reliability of large language models (LLMs) is to
provide accurate confidence estimations regarding the correctness of their answers …

Label-Confidence-Aware Uncertainty Estimation in Natural Language Generation

Q Lin, L Zhou, Z Yang, Y Cai - arxiv preprint arxiv:2412.07255, 2024 - arxiv.org
Large Language Models (LLMs) display formidable capabilities in generative tasks but also
pose potential risks due to their tendency to generate hallucinatory responses. Uncertainty …

Large Language Model Uncertainty Measurement and Calibration for Medical Diagnosis and Treatment

T Savage, J Wang, R Gallo, A Boukil, V Patel… - medRxiv, 2024 - medrxiv.org
Introduction: The inability for Large Language Models (LLMs) to communicate uncertainty is
a significant barrier to their use in medicine. Before LLMs can be integrated into patient care …

Lookers-On See Most of the Game: An External Insight-Guided Method for Enhancing Uncertainty Estimation

R Li, J Long, M Qi, L Sha, P Wang, H Xia, Z Sui - openreview.net
Large Language Models (LLMs) have gained increasing attention for their impressive
capabilities, alongside concerns about the reliability arising from their potential to generate …