Esther Derman
Mila - Quebec AI Institute
Verified email at mila.quebec
Title
Cited by
Year
Soft-Robust Actor-Critic Policy-Gradient
E Derman, DJ Mankowitz, TA Mann, S Mannor
AUAI press for Association for Uncertainty in Artificial Intelligence, 208-218, 2018
Cited by: 67; Year: 2018
A Bayesian approach to robust reinforcement learning
E Derman, D Mankowitz, T Mann, S Mannor
Uncertainty in Artificial Intelligence, 648-658, 2020
Cited by: 57; Year: 2020
Twice regularized MDPs and the equivalence between robustness and regularization
E Derman, M Geist, S Mannor
Advances in Neural Information Processing Systems 34, 22274-22287, 2021
Cited by: 50; Year: 2021
Distributional Robustness and Regularization in Reinforcement Learning
E Derman, S Mannor
ICML Workshop on Theoretical Foundations of Reinforcement Learning, 2020
Cited by: 48; Year: 2020
Acting in Delayed Environments with Non-Stationary Markov Policies
E Derman, G Dalal, S Mannor
International Conference on Learning Representations (ICLR), 2021
Cited by: 40; Year: 2021
Policy Gradient for Rectangular Robust Markov Decision Processes
N Kumar, E Derman, M Geist, K Levy, S Mannor
Advances in Neural Information Processing Systems 36, 2024
Cited by: 36; Year: 2024
Clustering and model selection via penalized likelihood for different-sized categorical data vectors
E Derman, E Le Pennec
arXiv preprint arXiv:1709.02294, 2017
Cited by: 3; Year: 2017
Solving non-rectangular reward-robust MDPs via frequency regularization
U Gadot, E Derman, N Kumar, MM Elfatihi, K Levy, S Mannor
Proceedings of the AAAI Conference on Artificial Intelligence 38 (19), 21090 …, 2024
Cited by: 2; Year: 2024
Twice regularized Markov decision processes: The equivalence between robustness and regularization
E Derman, Y Men, M Geist, S Mannor
arXiv preprint arXiv:2303.06654, 2023
Cited by: 2*; Year: 2023
Tree Search-Based Policy Optimization under Stochastic Execution Delay
D Valensi, E Derman, S Mannor, G Dalal
The Twelfth International Conference on Learning Representations, 2023
Cited by: 1; Year: 2023
Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis
JL Hau, E Delage, E Derman, M Ghavamzadeh, M Petrik
arXiv preprint arXiv:2410.24128, 2024
Year: 2024
Targeted Uncertainty Reduction in Robust MDPs
U Gadot, K Wang, E Derman, N Kumar, K Levy, S Mannor
NeurIPS 2023 Workshop on Generalization in Planning
Articles 1–12