Explicable reward design for reinforcement learning agents

R Devidze, G Radanovic… - Advances in neural …, 2021 - proceedings.neurips.cc
We study the design of explicable reward functions for a reinforcement learning agent while
guaranteeing that an optimal policy induced by the function belongs to a set of target …

Policy teaching in reinforcement learning via environment poisoning attacks

A Rakhsha, G Radanovic, R Devidze, X Zhu… - Journal of Machine …, 2021 - jmlr.org
We study a security threat to reinforcement learning where an attacker poisons the learning
environment to force the agent into executing a target policy chosen by the attacker. As a …

Informativeness of Reward Functions in Reinforcement Learning

R Devidze, P Kamalaruban, A Singla - arXiv preprint arXiv:2402.07019, 2024 - arxiv.org
Reward functions are central in specifying the task we want a reinforcement learning agent
to perform. Given a task and desired optimal behavior, we study the problem of designing …

CauseOccam: Learning interpretable abstract representations in reinforcement learning environments via model sparsity

S Volodin - 2021 - infoscience.epfl.ch
"I choose this restaurant because they have vegan sandwiches" could be a typical
explanation we would expect from a human. However, current Reinforcement Learning (RL) …

[PDF] Multi-expert Preference Alignment in Reinforcement Learning

L Li - 2024 - repository.tudelft.nl
I was attracted to this project from the beginning, as the concept aligns with what I have
always wanted to pursue since deciding to major in computer science. I am broadly …

[CITATION][C] Improving performance of deep reinforcement learning by incorporating human expertise

QVH Nguyen, I Razzak, D Van Le, VJ Reddi - 2022 - unpublished