Internal consistency and self-feedback in large language models: A survey

X Liang, S Song, Z Zheng, H Wang, Q Yu, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) often exhibit deficient reasoning or generate hallucinations.
To address these, studies prefixed with "Self-" such as Self-Consistency, Self-Improve, and …

GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation

R Zhu, Z Jiang, J Wu, Z Ma, J Song, F Bai, D Lin… - arXiv preprint arXiv …, 2025 - arxiv.org
Refusal-Aware Instruction Tuning (RAIT) aims to enhance Large Language Models (LLMs)
by improving their ability to refuse responses to questions beyond their knowledge, thereby …

Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing

W Liu, S An, J Lu, M Wu, T Li, X Wang, X Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Role-Playing Agents (RPAs) have shown remarkable performance in various applications,
yet they often struggle to recognize and appropriately respond to hard queries that conflict …

Drawing the Line: Enhancing Trustworthiness of MLLMs Through the Power of Refusal

Y Wang, Z Zhu, H Liu, Y Liao, H Liu, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal large language models (MLLMs) excel at multimodal perception and
understanding, yet their tendency to generate hallucinated or inaccurate responses …

Entropy Reveals What You Know: An Entropy-Guided Method for Enhancing the Reliability of Large Language Models

XQ Han, R Li, Z Yan, JZ Pan - openreview.net
While large language models (LLMs) encode vast amounts of knowledge within their
parameters for some mainstream entities, factual inconsistencies and untruthfulness in LLMs …

A Survey of Hallucination Problems Based on Large Language Models

X Liu - Applied and Computational Engineering, 2024 - ewadirect.com
Large language models (LLM) have made significant achievements in the field of natural
language processing, but the generated text often contains content that is inconsistent with …