Large language models and games: A survey and roadmap

R Gallotta, G Todd, M Zammit, S Earle… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Recent years have seen an explosive increase in research on large language models
(LLMs), and accompanying public engagement on the topic. While starting as a niche area …

Review of machine learning in robotic grasping control in space application

H Jahanshahi, ZH Zhu - Acta Astronautica, 2024 - Elsevier
This article presents a comprehensive survey of the integration of machine learning
techniques into robotic grasping, with a special emphasis on the challenges and …

The Llama 3 herd of models

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

Secrets of RLHF in large language models, Part II: Reward modeling

B Wang, R Zheng, L Chen, Y Liu, S Dou… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement Learning from Human Feedback (RLHF) has become a crucial technology
for aligning language models with human values and intentions, enabling models to …
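
The snippet stops at the motivation, so as context only: reward models in RLHF are commonly fit with a Bradley-Terry pairwise loss over human preference pairs (a standard formulation, not necessarily the exact one this paper analyzes):

$$\mathcal{L}_{\mathrm{RM}}(\theta) = -\mathbb{E}_{(x,\, y_w,\, y_l)}\left[\log \sigma\big(r_\theta(x, y_w) - r_\theta(x, y_l)\big)\right]$$

where $y_w$ and $y_l$ are the preferred and rejected responses to prompt $x$, $r_\theta$ is the learned reward, and $\sigma$ is the logistic function.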

RLHF deciphered: A critical analysis of reinforcement learning from human feedback for LLMs

S Chaudhari, P Aggarwal, V Murahari… - arXiv preprint arXiv …, 2024 - arxiv.org
State-of-the-art large language models (LLMs) have become indispensable tools for various
tasks. However, training LLMs to serve as effective assistants for humans requires careful …

Machine unlearning: Taxonomy, metrics, applications, challenges, and prospects

N Li, C Zhou, Y Gao, H Chen, Z Zhang… - … on Neural Networks …, 2025 - ieeexplore.ieee.org
Personal digital data is a critical asset, and governments worldwide have enforced laws and
regulations to protect data privacy. Data users have been endowed with the “right to be …

Provably robust DPO: Aligning language models with noisy feedback

SR Chowdhury, A Kini, N Natarajan - arXiv preprint arXiv:2403.00409, 2024 - arxiv.org
Learning from preference-based feedback has recently gained traction as a promising
approach to align language models with human interests. While these aligned generative …
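
For orientation, the underlying objective this paper hardens against noisy labels is the standard direct preference optimization (DPO) loss of Rafailov et al. (quoted here as background, not as this paper's contribution):

$$\mathcal{L}_{\mathrm{DPO}}(\theta) = -\mathbb{E}_{(x,\, y_w,\, y_l)}\left[\log \sigma\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]$$

where $\pi_{\mathrm{ref}}$ is the frozen reference policy and $\beta$ controls the strength of the implicit KL constraint.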

A survey on pragmatic processing techniques

R Mao, M Ge, S Han, W Li, K He, L Zhu, E Cambria - Information Fusion, 2025 - Elsevier
Pragmatics, situated in the domains of linguistics and computational linguistics, explores the
influence of context on language interpretation, extending beyond the literal meaning of …

BoNBoN alignment for large language models and the sweetness of best-of-n sampling

L Gui, C Gârbacea, V Veitch - arXiv preprint arXiv:2406.00832, 2024 - arxiv.org
This paper concerns the problem of aligning samples from large language models to human
preferences using best-of-$n$ sampling, where we draw $n$ samples, rank them, and …
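
The snippet already sketches the procedure (draw $n$ samples, rank them, keep the best); a minimal Python sketch, with `generate` and `score` as hypothetical stand-ins for the base LLM sampler and a reward model:

```python
from typing import Callable, List

def best_of_n(
    prompt: str,
    generate: Callable[[str], str],      # draws one sample from the base LLM
    score: Callable[[str, str], float],  # scalar score, e.g. from a reward model
    n: int = 8,
) -> str:
    """Draw n samples, rank them by score, and return the top-ranked one."""
    samples: List[str] = [generate(prompt) for _ in range(n)]
    return max(samples, key=lambda y: score(prompt, y))
```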

Hallucinations in LLMs: Understanding and addressing challenges

G Perković, A Drobnjak, I Botički - 2024 47th MIPRO ICT and …, 2024 - ieeexplore.ieee.org
Large language models (LLMs) are trained to understand and generate human-like
language. While LLMs present a cutting-edge concept and their use is becoming …