Reinforcement learning: An overview
K Murphy - arXiv preprint arXiv:2412.05265, 2024 - arxiv.org
This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement
learning and sequential decision making, covering value-based RL, policy-gradient …
RL, but don't do anything I wouldn't do
In reinforcement learning, if the agent's reward differs from the designers' true utility, even
only rarely, the state distribution resulting from the agent's policy can be very bad, in theory …
Limit-Computable Grains of Truth for Arbitrary Computable Extensive-Form (Un)Known Games
A Bayesian agent acting in a multi-agent environment learns to predict the other agents'
policies if its prior assigns positive probability to them (in other words, its prior contains a …