Google Scholar

Predictive representations: Building blocks of intelligence

W Carvalho, MS Tomov, W de Cothi, C Barry… - Neural …, 2024 - direct.mit.edu

Adaptive behavior often requires predicting future events. The theory of reinforcement
learning prescribes what kinds of predictive representations are useful and how to compute …

Cited by 6 · 2 versions

For SALE: State-action representation learning for deep reinforcement learning

S Fujimoto, WD Chang, E Smith… - Advances in …, 2024 - proceedings.neurips.cc

In reinforcement learning (RL), representation learning is a proven tool for complex image-
based tasks, but is often overlooked for environments with low-level states, such as physical …

Cited by 55 · 5 versions

Data representativity for machine learning and AI systems

LH Clemmensen, RD Kjærsgaard - arXiv preprint arXiv:2203.04706, 2022 - arxiv.org

Data representativity is crucial when drawing inference from data through machine learning
models. Scholars have increased focus on unraveling the bias and fairness in models, also …

Cited by 24 · 3 versions

Beyond uniform sampling: Offline reinforcement learning with imbalanced datasets

ZW Hong, A Kumar, S Karnik… - Advances in …, 2023 - proceedings.neurips.cc

Offline reinforcement learning (RL) enables learning a decision-making policy without
interaction with the environment. This makes it particularly beneficial in situations where …

Cited by 15 · 6 versions

Marginal density ratio for off-policy evaluation in contextual bandits

MF Taufiq, A Doucet, R Cornish… - Advances in Neural …, 2024 - proceedings.neurips.cc

Off-Policy Evaluation (OPE) in contextual bandits is crucial for assessing new
policies using existing data without costly experimentation. However, current OPE methods …

Cited by 4 · 4 versions

Why should I trust you, Bellman? The Bellman error is a poor replacement for value error

S Fujimoto, D Meger, D Precup, O Nachum… - arXiv preprint arXiv …, 2022 - arxiv.org

In this work, we study the use of the Bellman equation as a surrogate objective for value
prediction accuracy. While the Bellman equation is uniquely solved by the true value …

Cited by 37 · 3 versions

Sample complexity of nonparametric off-policy evaluation on low-dimensional manifolds using deep networks

X Ji, M Chen, M Wang, T Zhao - arXiv preprint arXiv:2206.02887, 2022 - arxiv.org

We consider the off-policy evaluation problem of reinforcement learning using deep
convolutional neural networks. We analyze the deep fitted Q-evaluation method for …

Cited by 19 · 5 versions

Composing task knowledge with modular successor feature approximators

W Carvalho, A Filos, RL Lewis, S Singh - arXiv preprint arXiv …, 2023 - arxiv.org

Recently, the Successor Features and Generalized Policy Improvement (SF&GPI) framework
has been proposed as a method for learning, composing, and transferring predictive …

Cited by 11 · 4 versions

Policy Correction and State-Conditioned Action Evaluation for Few-Shot Lifelong Deep Reinforcement Learning

M Xu, X Chen, J Wang - IEEE Transactions on Neural …, 2024 - ieeexplore.ieee.org

Lifelong deep reinforcement learning (DRL) approaches are commonly employed to adapt
continuously to new tasks without forgetting previously acquired knowledge. While current …

Cited by 2 · 5 versions

Knowledge transfer in multi-agent reinforcement learning with incremental number of agents

W Liu, L Dong, J Liu, C Sun - Journal of systems engineering …, 2022 - ieeexplore.ieee.org

In this paper, the reinforcement learning method for cooperative multi-agent systems (MAS)
with incremental number of agents is studied. The existing multi-agent reinforcement …

Cited by 5
