Language models don't always say what they think: unfaithful explanations in chain-of-thought prompting

M Turpin, J Michael, E Perez… - Advances in Neural …, 2024‏ - proceedings.neurips.cc
Abstract Large Language Models (LLMs) can achieve strong performance on many tasks by
producing step-by-step reasoning before giving a final output, often referred to as chain-of …

Siren's song in the AI ocean: a survey on hallucination in large language models

Y Zhang, Y Li, L Cui, D Cai, L Liu, T Fu… - arxiv preprint arxiv …, 2023‏ - arxiv.org
While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …

Whose opinions do language models reflect?

S Santurkar, E Durmus, F Ladhak… - International …, 2023‏ - proceedings.mlr.press
Abstract Language models (LMs) are increasingly being used in open-ended contexts,
where the opinions they reflect in response to subjective queries can have a profound …

Inference-time intervention: Eliciting truthful answers from a language model

K Li, O Patel, F Viégas, H Pfister… - Advances in Neural …, 2024‏ - proceedings.neurips.cc
Abstract We introduce Inference-Time Intervention (ITI), a technique designed to enhance
the" truthfulness" of large language models (LLMs). ITI operates by shifting model activations …

Trustllm: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arxiv preprint arxiv …, 2024‏ - arxiv.org
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

L Huang, W Yu, W Ma, W Zhong, Z Feng… - ACM Transactions on …, 2024‏ - dl.acm.org
The emergence of large language models (LLMs) has marked a significant breakthrough in
natural language processing (NLP), fueling a paradigm shift in information acquisition …