- Academic Search

A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer

Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

Cited by 609. All 2 versions.

[PDF] arxiv.org

Exploration in deep reinforcement learning: A survey

P Ladosz, L Weng, M Kim, H Oh - Information Fusion, 2022 - Elsevier

This paper reviews exploration techniques in deep reinforcement learning. Exploration
techniques are of primary importance when solving sparse reward problems. In sparse …

Cited by 372. All 5 versions.

[PDF] arxiv.org

Emergent tool use from multi-agent autocurricula

B Baker, I Kanitscheider, T Markov, Y Wu… - arXiv preprint arXiv …, 2019 - arxiv.org

Through multi-agent competition, the simple objective of hide-and-seek, and standard
reinforcement learning algorithms at scale, we find that agents create a self-supervised …

Cited by 912. All 3 versions.

[PDF] jair.org

Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org

In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Cited by 338. All 9 versions.

[PDF] mlr.press

Planning to explore via self-supervised world models

R Sekar, O Rybkin, K Daniilidis… - International …, 2020 - proceedings.mlr.press

Reinforcement learning allows solving complex tasks, however, the learning tends to be
task-specific and the sample efficiency remains a challenge. We present Plan2Explore, a self …

Cited by 458. All 8 versions.

[PDF] ed.ac.uk

Exploration by random network distillation

Y Burda, H Edwards, A Storkey, O Klimov - arXiv preprint arXiv …, 2018 - arxiv.org

We introduce an exploration bonus for deep reinforcement learning methods that is easy to
implement and adds minimal overhead to the computation performed. The bonus is the error …

Cited by 1589. All 10 versions.

[PDF] nowpublishers.com

Model-based reinforcement learning: A survey

TM Moerland, J Broekens, A Plaat… - … and Trends® in …, 2023 - nowpublishers.com

Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …

Cited by 922. All 17 versions.

[PDF] arxiv.org

Large-scale study of curiosity-driven learning

Y Burda, H Edwards, D Pathak, A Storkey… - arXiv preprint arXiv …, 2018 - arxiv.org

Reinforcement learning algorithms rely on carefully engineering environment rewards that
are extrinsic to the agent. However, annotating each environment with hand-designed …

Cited by 918. All 9 versions.

[PDF] mlr.press

APS: Active pretraining with successor features

H Liu, P Abbeel - International Conference on Machine …, 2021 - proceedings.mlr.press

We introduce a new unsupervised pretraining objective for reinforcement learning. During
the unsupervised reward-free pretraining phase, the agent maximizes mutual information …

Cited by 151. All 5 versions.

[PDF] mlr.press

Self-supervised exploration via disagreement

D Pathak, D Gandhi, A Gupta - International conference on …, 2019 - proceedings.mlr.press

Efficient exploration is a long-standing problem in sensorimotor learning. Major advances
have been demonstrated in noise-free, non-stochastic domains such as video games and …

Cited by 455. All 6 versions.

Saved in My library

Surprise-based intrinsic motivation for deep reinforcement learning

A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

Exploration in deep reinforcement learning: A survey

Emergent tool use from multi-agent autocurricula

Towards continual reinforcement learning: A review and perspectives

Planning to explore via self-supervised world models

Exploration by random network distillation

Model-based reinforcement learning: A survey

Large-scale study of curiosity-driven learning

APS: Active pretraining with successor features

Self-supervised exploration via disagreement