Google Scholar

2 results (0.02 s)

No-regret Shannon entropy regularized neural contextual bandit online learning for robotic gras**

[PDF] mdpi.com

Maximum entropy exploration in contextual bandits with neural networks and energy based models

A Elwood, M Leonardi, A Mohamed, A Rozza - Entropy, 2023 - mdpi.com

Contextual bandits can solve a huge range of real-world problems. However, current
popular algorithms to solve them either rely on linear models or unreliable uncertainty …

Cited by 1; 9 versions

[PDF] snu.ac.kr

Dual Variable Actor-Critic for Adaptive Safe Reinforcement Learning

J Lee, J Heo, D Kim, G Lee, S Oh - 2023 IEEE/RSJ International …, 2023 - ieeexplore.ieee.org

Satisfying safety constraints in reinforcement learning (RL) is an important issue, especially
in real-world applications. Many studies have approached safe RL with the Lagrangian …

Cited by 1; 2 versions
