Automatically correcting large language models: Surveying the landscape of diverse self-correction strategies

L Pan, M Saxon, W Xu, D Nathani, X Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable performance across a wide
array of NLP tasks. However, their efficacy is undermined by undesired and inconsistent …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

MetaMath: Bootstrap your own mathematical questions for large language models

L Yu, W Jiang, H Shi, J Yu, Z Liu, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …

LoRA learns less and forgets less

D Biderman, J Portes, JJG Ortiz, M Paul… - arXiv preprint arXiv …, 2024 - arxiv.org
Low-Rank Adaptation (LoRA) is a widely-used parameter-efficient finetuning method for
large language models. LoRA saves memory by training only low rank perturbations to …

LLMs can't plan, but can help planning in LLM-modulo frameworks

S Kambhampati, K Valmeekam, L Guan… - arXiv preprint arXiv …, 2024 - arxiv.org
There is considerable confusion about the role of Large Language Models (LLMs) in
planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed …

The caring machine: Feeling AI for customer care

MH Huang, RT Rust - Journal of Marketing, 2024 - journals.sagepub.com
Customer care is important for its role in relationship building. This role has traditionally
been performed by human customer agents; however, the emergence of interactive …

LLM Self Defense: By self examination, LLMs know they are being tricked

M Phute, A Helbling, M Hull, SY Peng, S Szyller… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) are popular for high-quality text generation but can produce
harmful content, even when aligned with human values through reinforcement learning …

TrueTeacher: Learning factual consistency evaluation with large language models

Z Gekhman, J Herzig, R Aharoni, C Elkind… - arXiv preprint arXiv …, 2023 - arxiv.org
Factual consistency evaluation is often conducted using Natural Language Inference (NLI)
models, yet these models exhibit limited success in evaluating summaries. Previous work …

Personal LLM agents: Insights and survey about the capability, efficiency and security

Y Li, H Wen, W Wang, X Li, Y Yuan, G Liu, J Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Since the advent of personal computing devices, intelligent personal assistants (IPAs) have
been one of the key technologies that researchers and engineers have focused on, aiming …

Distilling System 2 into System 1

P Yu, J Xu, J Weston, I Kulikov - arXiv preprint arXiv:2407.06023, 2024 - arxiv.org
Large language models (LLMs) can spend extra compute during inference to generate
intermediate thoughts, which helps to produce better final responses. Since Chain-of …