Quantitative text analysis

KL Nielbo, F Karsdorp, M Wevers, A Lassche… - Nature Reviews …, 2024 - nature.com
Text analysis has undergone substantial evolution since its inception, moving from manual
qualitative assessments to sophisticated quantitative and computational methods. Beginning …

Siren's song in the AI ocean: a survey on hallucination in large language models

Y Zhang, Y Li, L Cui, D Cai, L Liu, T Fu… - arXiv preprint arXiv …, 2023 - arxiv.org
While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …

AI models collapse when trained on recursively generated data

I Shumailov, Z Shumaylov, Y Zhao, N Papernot… - Nature, 2024 - nature.com
Stable diffusion revolutionized image creation from descriptive text. GPT-2 (ref.), GPT-3(.5)
(ref.) and GPT-4 (ref.) demonstrated high performance across a variety of language tasks …

A survey on LLM-generated text detection: Necessity, methods, and future directions

J Wu, S Yang, R Zhan, Y Yuan, LS Chao… - Computational …, 2025 - direct.mit.edu
The remarkable ability of large language models (LLMs) to comprehend, interpret, and
generate complex language has rapidly integrated LLM-generated text into various aspects …

The science of detecting LLM-generated text

R Tang, YN Chuang, X Hu - Communications of the ACM, 2024 - dl.acm.org
Communications of the ACM, Volume 67, Number 4 (2024), Pages 50-59 …

RL on incorrect synthetic data scales the efficiency of LLM math reasoning by eight-fold

A Setlur, S Garg, X Geng, N Garg… - Advances in Neural …, 2025 - proceedings.neurips.cc
Training on model-generated synthetic data is a promising approach for finetuning LLMs,
but it remains unclear when it helps or hurts. In this paper, we investigate this question for …

The economic impacts and the regulation of AI: A review of the academic literature and policy actions

M Comunale, A Manera - 2024 - books.google.com
We review the literature on the effects of Artificial Intelligence (AI) adoption and the ongoing
regulatory efforts concerning this technology. Economic research encompasses growth …

Scalable watermarking for identifying large language model outputs

S Dathathri, A See, S Ghaisas, PS Huang, R McAdam… - Nature, 2024 - nature.com
Large language models (LLMs) have enabled the generation of high-quality synthetic text,
often indistinguishable from human-written content, at a scale that can markedly affect the …

Understanding hallucinations in diffusion models through mode interpolation

SK Aithal, P Maini, Z Lipton… - Advances in Neural …, 2025 - proceedings.neurips.cc
Colloquially speaking, image generation models based upon diffusion processes are
frequently said to exhibit "hallucinations", samples that could never occur in the training data …

Model collapse demystified: The case of regression

E Dohmatob, Y Feng, J Kempe - Advances in Neural …, 2025 - proceedings.neurips.cc
The era of proliferation of large language and image generation models begs the question
of what happens if models are trained on the synthesized outputs of other models. The …