Reinforcement learning for generative AI: State of the art, opportunities and open research challenges

G Franceschelli, M Musolesi - Journal of Artificial Intelligence Research, 2024 - jair.org
Abstract: Generative Artificial Intelligence (AI) is one of the most exciting developments in
Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has …

Self-refine: Iterative refinement with self-feedback

A Madaan, N Tandon, P Gupta… - Advances in …, 2024 - proceedings.neurips.cc
Like humans, large language models (LLMs) do not always generate the best output on their
first try. Motivated by how humans refine their written text, we introduce Self-Refine, an …
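The generate–feedback–refine loop this entry describes can be sketched as follows; this is a minimal illustration of the iterative-refinement pattern, not the paper's implementation. The functions `generate`, `feedback`, and `refine` are hypothetical stand-ins for LLM calls, replaced here by toy string logic so the control flow is runnable.

```python
# Toy sketch of an iterative self-refinement loop. In a real system each
# function below would be a call to the same LLM with a different prompt.

def generate(task):
    # Produce an initial draft for the task.
    return f"draft answer for: {task}"

def feedback(task, answer):
    # Critique the current answer; return "" when nothing is left to fix.
    return "" if "refined" in answer else "be more specific"

def refine(task, answer, critique):
    # Revise the answer using the critique.
    return f"refined ({critique}): {answer}"

def self_refine(task, max_iters=3):
    answer = generate(task)
    for _ in range(max_iters):
        critique = feedback(task, answer)
        if not critique:  # stop once the critic finds nothing to improve
            break
        answer = refine(task, answer, critique)
    return answer
```

The key design point is that the same model plays both author and critic, so refinement needs no extra training data, only extra inference calls.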

Augmented language models: a survey

G Mialon, R Dessì, M Lomeli, C Nalmpantis… - arXiv preprint arXiv …, 2023 - arxiv.org
This survey reviews works in which language models (LMs) are augmented with reasoning
skills and the ability to use tools. The former is defined as decomposing a potentially …

Language models meet world models: Embodied experiences enhance language models

J Xiang, T Tao, Y Gu, T Shu, Z Wang… - Advances in neural …, 2023 - proceedings.neurips.cc
While large language models (LMs) have shown remarkable capabilities across numerous
tasks, they often struggle with simple reasoning and planning in physical environments …

Reasoning with language model prompting: A survey

S Qiao, Y Ou, N Zhang, X Chen, Y Yao, S Deng… - arXiv preprint arXiv …, 2022 - arxiv.org
Reasoning, as an essential ability for complex problem-solving, can provide back-end
support for various real-world applications, such as medical diagnosis, negotiation, etc. This …

RL4F: Generating natural language feedback with reinforcement learning for repairing model outputs

AF Akyürek, E Akyürek, A Madaan, A Kalyan… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite their unprecedented success, even the largest language models make mistakes.
Similar to how humans learn and improve using feedback, previous work proposed …

Retrieve-rewrite-answer: A KG-to-text enhanced LLMs framework for knowledge graph question answering

Y Wu, N Hu, S Bi, G Qi, J Ren, A Xie… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite their competitive performance on knowledge-intensive tasks, large language
models (LLMs) still have limitations in memorizing all world knowledge especially long tail …
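The retrieve-then-rewrite idea in this entry can be sketched as a three-stage pipeline: retrieve knowledge-graph triples relevant to the question, rewrite them as natural-language text, and prepend that text to the question before answering. The tiny graph, the `KG` dictionary, and all function names below are illustrative assumptions, not the paper's framework.

```python
# Toy retrieve-rewrite-answer pipeline over a hand-built knowledge graph.
# Triples are (subject, predicate, object).

KG = {
    "Marie Curie": [("Marie Curie", "award", "Nobel Prize in Physics")],
}

def retrieve(question):
    # Return triples whose subject entity is mentioned in the question.
    return [t for ent, triples in KG.items() if ent in question
            for t in triples]

def rewrite(triples):
    # Verbalize the triples as plain text (KG-to-text step).
    return " ".join(f"{s} {p}: {o}." for s, p, o in triples)

def answer(question):
    # Build the augmented prompt; a real system would now pass this
    # prompt to an LLM instead of returning it.
    context = rewrite(retrieve(question))
    return f"{context}\nQuestion: {question}"
```

The rewrite step matters because LLMs consume fluent text more reliably than raw triples, which is the motivation the abstract gestures at.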

IntentQA: Context-aware video intent reasoning

J Li, P Wei, W Han, L Fan - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
In this paper, we propose a novel task IntentQA, a special VideoQA task focusing on video
intent reasoning, which has become increasingly important for AI with its advantages in …

Large language models are versatile decomposers: Decomposing evidence and questions for table-based reasoning

Y Ye, B Hui, M Yang, B Li, F Huang, Y Li - Proceedings of the 46th …, 2023 - dl.acm.org
Table-based reasoning has shown remarkable progress in a wide range of table-based
tasks. It is a challenging task, which requires reasoning over both free-form natural language …

Complex QA and language models hybrid architectures, Survey

X Daull, P Bellot, E Bruno, V Martin… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper reviews the state of the art of language model architectures and strategies for
"complex" question-answering (QA, CQA, CPS), with a focus on hybridization. Large …