A survey of reinforcement learning from human feedback

T Kaufmann, P Weng, V Bengs… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning
(RL) that learns from human feedback instead of relying on an engineered reward function …

Offline meta reinforcement learning with in-distribution online adaptation

J Wang, J Zhang, H Jiang, J Zhang… - International …, 2023 - proceedings.mlr.press
Recent offline meta-reinforcement learning (meta-RL) methods typically utilize task-dependent
behavior policies (e.g., training RL agents on each individual task) to collect a …

Relative behavioral attributes: Filling the gap between symbolic goal specification and reward learning from human preferences

L Guan, K Valmeekam, S Kambhampati - arXiv preprint arXiv:2210.15906, 2022 - arxiv.org
Generating complex behaviors that satisfy the preferences of non-expert users is a crucial
requirement for AI agents. Interactive reward learning from trajectory comparisons (a.k.a. …

DGTRL: Deep graph transfer reinforcement learning method based on fusion of knowledge and data

G Chen, J Qi, Y Gao, X Zhu, Z Dong, Y Sun - Information Sciences, 2024 - Elsevier
Deep reinforcement learning has shown promising application effects in many fields.
However, issues such as low sample efficiency and weak knowledge transfer and …

On first-order meta-reinforcement learning with Moreau envelopes

MT Toghani, S Perez-Salazar… - 2023 62nd IEEE …, 2023 - ieeexplore.ieee.org
Meta-Reinforcement Learning (MRL) is a promising framework for training agents that can
quickly adapt to new environments and tasks. In this work, we study the MRL problem under …

A Meta-reinforcement Learning based Hyperspectral Image Classification with Small Sample Set

PYO Amoako, G Cao, D Yang, L Amoah… - IEEE Journal of …, 2023 - ieeexplore.ieee.org
The fine spectral information contained in hyperspectral images (HSI) is combined with rich
spatial features to provide feature qualities that serve as distinguishing variables for efficient …

A Survey of Reinforcement Learning for Optimization in Automation

A Farooq, K Iqbal - 2024 IEEE 20th International Conference on …, 2024 - ieeexplore.ieee.org
Reinforcement Learning (RL) has become a critical tool for optimization challenges within
automation, leading to significant advancements in several areas. This review article …

Taming the Sample Complexity in Agentifying AI Systems by the Exploitation of Explicit Human Knowledge

L Guan - 2024 - search.proquest.com
Extensive efforts have been dedicated to the development of AI agents that can
independently carry out sequential decision-making tasks. Learning-based solutions …

Towards Scalable and Personalized Collaborative Learning

MT Toghani - 2024 - search.proquest.com
This thesis studies a collaborative learning framework, where a group of agents cooperate to
learn a powerful model from their local data in a distributed or decentralized manner. We …

Foundation of Scalable Constraint Learning from Human Feedback

T Kozuno, H Kondoh, K Tanaka - openreview.net
Constraint learning from human feedback (CLHF) has garnered significant interest in the
domain of safe reinforcement learning (RL) due to the challenges associated with designing …