A probabilistic framework for LLM hallucination detection via belief tree propagation

B Hou, Y Zhang, J Andreas, S Chang - arXiv preprint arXiv:2406.06950, 2024 - arxiv.org
This paper focuses on the task of hallucination detection, which aims to determine the
truthfulness of LLM-generated statements. To address this problem, a popular class of …

Embedding and Gradient Say Wrong: A White-Box Method for Hallucination Detection

X Hu, Y Zhang, R Peng, H Zhang, C Wu… - Proceedings of the …, 2024 - aclanthology.org
In recent years, large language models (LLMs) have achieved remarkable success in the
field of natural language generation. Compared to previous small-scale models, they are …

Benchmarking LLMs in Political Content Text-Annotation: Proof-of-Concept with Toxicity and Incivility Data

B González-Bustamante - arXiv preprint arXiv:2409.09741, 2024 - arxiv.org
This article benchmarks the ability of OpenAI's GPTs and a number of open-source LLMs to
perform annotation tasks on political content. We used a novel protest event dataset …

Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension

Y Fang, R Tang - arXiv preprint arXiv:2501.01332, 2025 - arxiv.org
Understanding how large language models (LLMs) acquire, retain, and apply knowledge
remains an open challenge. This paper introduces a novel framework, K-(CSA)^2, which …

ANAH-v2: Scaling analytical hallucination annotation of large language models

Y Gu, Z Ji, W Zhang, C Lyu, D Lin, K Chen - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) exhibit hallucinations in long-form question-answering tasks
across various domains and a wide range of applications. Current hallucination detection and …