Google Scholar

Universal off-policy evaluation

Y Chandak, S Niekum, B da Silva… - Advances in …, 2021 - proceedings.neurips.cc

When faced with sequential decision-making problems, it is often useful to be able to predict
what would happen if decisions were made using a new policy. Those predictions must …

Cited by 59, all 11 versions


[PDF] thecvf.com

Learning to identify critical states for reinforcement learning from videos

H Liu, M Zhuge, B Li, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent work on deep reinforcement learning (DRL) has pointed out that algorithmic
information about good policies can be extracted from offline data which lack explicit …

Cited by 10, all 10 versions


[PDF] arxiv.org

A survey of temporal credit assignment in deep reinforcement learning

E Pignatelli, J Ferret, M Geist, T Mesnard… - arXiv preprint arXiv …, 2023 - arxiv.org

The Credit Assignment Problem (CAP) refers to the longstanding challenge of
Reinforcement Learning (RL) agents to associate actions with their long-term …

Cited by 14, all 3 versions


[PDF] arxiv.org

Learning useful representations of recurrent neural network weight matrices

V Herrmann, F Faccio, J Schmidhuber - arXiv preprint arXiv:2403.11998, 2024 - arxiv.org

Recurrent Neural Networks (RNNs) are general-purpose parallel-sequential computers. The
program of an RNN is its weight matrix. How to learn useful representations of RNN weights …

Cited by 7, all 9 versions


[PDF] aaai.org

Goal-conditioned generators of deep policies

F Faccio, V Herrmann, A Ramesh, L Kirsch… - Proceedings of the …, 2023 - ojs.aaai.org

Goal-conditioned Reinforcement Learning (RL) aims at learning optimal policies,
given goals encoded in special command inputs. Here we study goal-conditioned neural …

Cited by 14, all 12 versions


[PDF] aaai.org

What about inputting policy in value function: Policy representation and policy-extended value function approximator

H Tang, Z Meng, J Hao, C Chen, D Graves… - Proceedings of the …, 2022 - ojs.aaai.org

We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement
Learning (RL), which extends conventional value function approximator (VFA) to take as …

Cited by 24, all 6 versions


[PDF] ijcai.org

Learning Efficient Truthful Mechanisms for Trading Networks

T Osogami, S Wasserkrug, ES Shamash - IJCAI, 2023 - ijcai.org

Trading networks are an indispensable part of today's economy, but to compete successfully
with others, they must be efficient in maximizing the value they provide to the external …

Cited by 5, all 3 versions


[PDF] arxiv.org

General policy evaluation and improvement by learning to identify few but crucial states

F Faccio, A Ramesh, V Herrmann, J Harb… - arXiv preprint arXiv …, 2022 - arxiv.org

Learning to evaluate and improve policies is a core problem of Reinforcement Learning
(RL). Traditional RL algorithms learn a value function defined for a single policy. A recently …

Cited by 12, all 6 versions


[PDF] neurips.cc

Exploring through random curiosity with general value functions

A Ramesh, L Kirsch, S van Steenkiste… - Advances in Neural …, 2022 - proceedings.neurips.cc

Efficient exploration in reinforcement learning is a challenging problem commonly
addressed through intrinsic rewards. Recent prominent approaches are based on state …

Cited by 11, all 12 versions


[PDF] arxiv.org

Learning one abstract bit at a time through self-invented experiments encoded as neural networks

V Herrmann, L Kirsch, J Schmidhuber - International Workshop on Active …, 2023 - Springer

There are two important things in science: (A) Finding answers to given questions, and (B)
Coming up with good questions. Our artificial scientists not only learn to answer given …

Cited by 5, all 7 versions


Parameter-based value functions

Universal off-policy evaluation
