- Academic Search

A review of robot learning for manipulation: Challenges, representations, and algorithms

O Kroemer, S Niekum, G Konidaris - Journal of machine learning research, 2021 - jmlr.org

A key challenge in intelligent robotics is creating robots that are capable of directly
interacting with the world around them to achieve their goals. The last decade has seen …

Cited by 446

Conservative Q-learning for offline reinforcement learning

A Kumar, A Zhou, G Tucker… - Advances in Neural …, 2020 - proceedings.neurips.cc

Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is
a key challenge for large-scale real-world applications. Offline RL algorithms promise to …

Cited by 2046

Off-policy deep reinforcement learning without exploration

S Fujimoto, D Meger, D Precup - … conference on machine …, 2019 - proceedings.mlr.press

Many practical applications of reinforcement learning constrain agents to learn from a fixed
batch of data which has already been gathered, without offering further possibility for data …

Cited by 1790

The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care

M Komorowski, LA Celi, O Badawi, AC Gordon… - Nature medicine, 2018 - nature.com

Sepsis is the third leading cause of death worldwide and the main cause of mortality in
hospitals, but the best treatment strategy remains uncertain. In particular, evidence …

Cited by 1160

Doubly robust off-policy value evaluation for reinforcement learning

N Jiang, L Li - International conference on machine learning, 2016 - proceedings.mlr.press

We study the problem of off-policy value evaluation in reinforcement learning (RL), where
one aims to estimate the value of a new policy based on data collected by a different policy …

Cited by 880

Provably good batch off-policy reinforcement learning without great exploration

Y Liu, A Swaminathan, A Agarwal… - Advances in neural …, 2020 - proceedings.neurips.cc

Batch reinforcement learning (RL) is important to apply RL algorithms to many high stakes
tasks. Doing batch RL in a way that yields a reliable new policy in large domains is …

Cited by 231

Provable benefits of actor-critic methods for offline reinforcement learning

A Zanette, MJ Wainwright… - Advances in neural …, 2021 - proceedings.neurips.cc

Actor-critic methods are widely used in offline reinforcement learning practice, but are not so
well-understood theoretically. We propose a new offline actor-critic algorithm that naturally …

Cited by 145

Decision-making under uncertainty: beyond probabilities: Challenges and perspectives

T Badings, TD Simão, M Suilen, N Jansen - International Journal on …, 2023 - Springer

This position paper reflects on the state-of-the-art in decision-making under uncertainty. A
classical assumption is that probabilities can sufficiently capture all uncertainty in a system …

Cited by 17

More robust doubly robust off-policy evaluation

M Farajtabar, Y Chow… - … on Machine Learning, 2018 - proceedings.mlr.press

We study the problem of off-policy evaluation (OPE) in reinforcement learning (RL), where
the goal is to estimate the performance of a policy from the data generated by another policy …

Cited by 291

Preventing undesirable behavior of intelligent machines

PS Thomas, B Castro da Silva, AG Barto, S Giguere… - Science, 2019 - science.org

Intelligent machines using machine learning algorithms are ubiquitous, ranging from simple
data analysis and pattern recognition tools to complex systems that achieve superhuman …

Cited by 211

High confidence policy improvement

A review of robot learning for manipulation: Challenges, representations, and algorithms
