A survey of preference-based reinforcement learning methods
Reinforcement learning (RL) techniques optimize the accumulated long-term reward of a
suitably chosen reward function. However, designing such a reward function often requires …
A survey of reinforcement learning from human feedback
Reinforcement learning from human feedback (RLHF) is a variant of reinforcement learning
(RL) that learns from human feedback instead of relying on an engineered reward function …
Dueling posterior sampling for preference-based reinforcement learning
In preference-based reinforcement learning (RL), an agent interacts with the environment
while receiving preferences instead of absolute feedback. While there is increasing research …
Learning state importance for preference-based reinforcement learning
Preference-based reinforcement learning (PbRL) develops agents using human
preferences. Due to its empirical success, it has prospect of benefiting human-centered …
Preference-based reinforcement learning: A preliminary survey
Preference-based reinforcement learning has gained significant popularity over the years,
but it is still unclear what exactly preference learning is and how it relates to other …
A Survey on Human Preference Learning for Large Language Models
The recent surge of versatile large language models (LLMs) largely depends on aligning
increasingly capable foundation models with human intentions by preference learning …
Task transfer by preference-based cost learning
The goal of task transfer in reinforcement learning is migrating the action policy of an agent
to the target task from the source task. Given their successes on robotic action planning …
EPMC: Every visit preference Monte Carlo for reinforcement learning
Reinforcement learning algorithms are usually hard to use for non-expert users. It is required
to consider several aspects like the definition of state-, action- and reward-space as well as …
Adapting Robotic Systems to User Control
U Biswas - 2023 - search.proquest.com
In this work, I propose to bridge the gap between human users and adaptive control of
robotic systems. The goal is to enable robots to consider user feedback and adjust their …
What Do You Want Me to Do? Addressing Model Differences for Human-Aware Decision-Making from a Learning Perspective
Z Gong - 2022 - search.proquest.com
As intelligent agents become pervasive in our lives, they are expected to not only achieve
tasks alone but also engage in tasks with humans in the loop. In such cases, the human …