Narrowing the knowledge evaluation gap: Open-domain question answering with multi-granularity answers
Factual questions typically can be answered correctly at different levels of granularity. For
example, both "August 4, 1961" and "1961" are correct answers to the question "When was …
Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees
Before deploying outputs from foundation models in high-stakes tasks, it is imperative to
ensure that they align with human values. For instance, in radiology report generation …
LUQ: Long-text Uncertainty Quantification for LLMs
Large Language Models (LLMs) have demonstrated remarkable capability in a variety of
NLP tasks. Despite their effectiveness, these models are prone to generate nonfactual …
Large Language Models as an active Bayesian filter: information acquisition and integration
This study investigates Large Language Models (LLMs) as dynamic Bayesian filters
through question-asking experiments inspired by cognitive science. We analyse LLMs' …