- Academic Search

The surprising effectiveness of PPO in cooperative multi-agent games

C Yu, A Velu, E Vinitsky, J Gao… - Advances in …, 2022 - proceedings.neurips.cc

Proximal Policy Optimization (PPO) is a ubiquitous on-policy reinforcement learning
algorithm but is significantly less utilized than off-policy learning algorithms in multi-agent …

Cited by 1525

A minimalist approach to offline reinforcement learning

S Fujimoto, SS Gu - Advances in neural information …, 2021 - proceedings.neurips.cc

Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.
Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms …

Cited by 853

Efficient online reinforcement learning with offline data

PJ Ball, L Smith, I Kostrikov… - … Conference on Machine …, 2023 - proceedings.mlr.press

Sample efficiency and exploration remain major challenges in online reinforcement learning
(RL). A powerful approach that can be applied to address these issues is the inclusion of …

Cited by 141

RAMBO-RL: Robust adversarial model-based offline reinforcement learning

M Rigter, B Lacerda, N Hawes - Advances in neural …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) aims to find performant policies from logged data without
further environment interaction. Model-based algorithms, which learn a model of the …

Cited by 130

Loss of plasticity in deep continual learning

S Dohare, JF Hernandez-Garcia, Q Lan, P Rahman… - Nature, 2024 - nature.com

Artificial neural networks, deep-learning methods and the backpropagation algorithm form
the foundation of modern machine learning and artificial intelligence. These methods are …

Cited by 44

Secrets of RLHF in large language models part I: PPO

R Zheng, S Dou, S Gao, Y Hua, W Shen… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have formulated a blueprint for the advancement of artificial
general intelligence. Its primary objective is to function as a human-centric (helpful, honest …

Cited by 103

Recurrent model-free RL can be a strong baseline for many POMDPs

T Ni, B Eysenbach, R Salakhutdinov - arXiv preprint arXiv:2110.05038, 2021 - arxiv.org

Many problems in RL, such as meta-RL, robust RL, generalization in RL, and temporal credit
assignment, can be cast as POMDPs. In theory, simply augmenting model-free RL with …

Cited by 120

A survey on transformers in reinforcement learning

W Li, H Luo, Z Lin, C Zhang, Z Lu, D Ye - arXiv preprint arXiv:2301.03044, 2023 - arxiv.org

Transformer has been considered the dominating neural architecture in NLP and CV, mostly
under supervised settings. Recently, a similar surge of using Transformers has appeared in …

Cited by 69

Hyperparameters in reinforcement learning and how to tune them

T Eimer, M Lindauer… - … Conference on Machine …, 2023 - proceedings.mlr.press

In order to improve reproducibility, deep reinforcement learning (RL) has been adopting
better scientific practices such as standardized evaluation metrics and reporting. However …

Cited by 47

Reinforcement learning in practice: Opportunities and challenges

Y Li - arXiv preprint arXiv:2202.11296, 2022 - arxiv.org

This article is a gentle discussion about the field of reinforcement learning in practice, about
opportunities and challenges, touching a broad range of topics, with perspectives and …

Cited by 28
