A review of robot learning for manipulation: Challenges, representations, and algorithms
A key challenge in intelligent robotics is creating robots that are capable of directly
interacting with the world around them to achieve their goals. The last decade has seen …
Conservative Q-learning for offline reinforcement learning
Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is
a key challenge for large-scale real-world applications. Offline RL algorithms promise to …
Off-policy deep reinforcement learning without exploration
Many practical applications of reinforcement learning constrain agents to learn from a fixed
batch of data which has already been gathered, without offering further possibility for data …
The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care
Sepsis is the third leading cause of death worldwide and the main cause of mortality in
hospitals, but the best treatment strategy remains uncertain. In particular, evidence …
Doubly robust off-policy value evaluation for reinforcement learning
We study the problem of off-policy value evaluation in reinforcement learning (RL), where
one aims to estimate the value of a new policy based on data collected by a different policy …
Provably good batch off-policy reinforcement learning without great exploration
Batch reinforcement learning (RL) is important to apply RL algorithms to many high-stakes
tasks. Doing batch RL in a way that yields a reliable new policy in large domains is …
Provable benefits of actor-critic methods for offline reinforcement learning
Actor-critic methods are widely used in offline reinforcement learning practice, but are not so
well-understood theoretically. We propose a new offline actor-critic algorithm that naturally …
Decision-making under uncertainty: beyond probabilities: Challenges and perspectives
This position paper reflects on the state-of-the-art in decision-making under uncertainty. A
classical assumption is that probabilities can sufficiently capture all uncertainty in a system …
More robust doubly robust off-policy evaluation
We study the problem of off-policy evaluation (OPE) in reinforcement learning (RL), where
the goal is to estimate the performance of a policy from the data generated by another policy …
Preventing undesirable behavior of intelligent machines
Intelligent machines using machine learning algorithms are ubiquitous, ranging from simple
data analysis and pattern recognition tools to complex systems that achieve superhuman …