Google Scholar

[HTML] Explainable AI for operational research: A defining framework, methods, applications, and a research agenda

KW De Bock, K Coussement, A De Caigny… - European Journal of …, 2024 - Elsevier

The ability to understand and explain the outcomes of data analysis methods, with regard to
aiding decision-making, has become a critical requirement for many applications. For …

Cited by 58 · All 11 versions

[PDF] annualreviews.org

Causal inference in the social sciences

GW Imbens - Annual Review of Statistics and Its Application, 2024 - annualreviews.org

Knowledge of causal effects is of great importance to decision makers in a wide variety of
settings. In many cases, however, these causal effects are not known to the decision makers …

Cited by 38 · All 2 versions

[PDF] neurips.cc

Cal-QL: Calibrated offline RL pre-training for efficient online fine-tuning

M Nakamoto, S Zhai, A Singh… - Advances in …, 2024 - proceedings.neurips.cc

A compelling use case of offline reinforcement learning (RL) is to obtain a policy initialization
from existing datasets followed by fast online fine-tuning with limited interaction. However …

Cited by 104 · All 7 versions

[PDF] mlr.press

Is pessimism provably efficient for offline RL?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press

We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

Cited by 450 · All 7 versions

[PDF] arxiv.org

The statistical complexity of interactive decision making

DJ Foster, SM Kakade, J Qian, A Rakhlin - arXiv preprint arXiv:2112.13487, 2021 - arxiv.org

A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …

Cited by 205 · All 6 versions

[PDF] ai-plans.com

[PDF] Nash learning from human feedback

R Munos, M Valko, D Calandriello, MG Azar… - arXiv preprint arXiv …, 2023 - ai-plans.com

Large language models (LLMs) (Anil et al., 2023; Glaese et al., 2022; OpenAI, 2023; Ouyang
et al., 2022) have made remarkable strides in enhancing natural language understanding …

Cited by 90 · All 5 versions

[PDF] mlr.press

Provably efficient reinforcement learning with linear function approximation

C Jin, Z Yang, Z Wang… - Conference on learning …, 2020 - proceedings.mlr.press

Modern Reinforcement Learning (RL) is commonly applied to practical problems
with an enormous number of states, where function approximation must be deployed …

Cited by 774 · All 4 versions

[BOOK] Control systems and reinforcement learning

S Meyn - 2022 - books.google.com

A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …

Cited by 158 · All 3 versions

[PDF] mlr.press

Nearly minimax optimal reinforcement learning for linear mixture Markov decision processes

D Zhou, Q Gu, C Szepesvari - Conference on Learning …, 2021 - proceedings.mlr.press

We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …

Cited by 245 · All 7 versions

[PDF] mlr.press

When is partially observable reinforcement learning not scary?

Q Liu, A Chung, C Szepesvári… - Conference on Learning …, 2022 - proceedings.mlr.press

Partial observability is ubiquitous in applications of Reinforcement Learning (RL), in which
agents learn to make a sequence of decisions despite lacking complete information about …

Cited by 112 · All 7 versions
