Masatoshi Uehara
EvolutionaryScale
Verified email at evolutionaryscale.ai - Homepage
Title
Cited by
Year
Double reinforcement learning for efficient off-policy evaluation in Markov decision processes
N Kallus, M Uehara
Journal of Machine Learning Research 21 (167), 1-63, 2020
Cited by 228; year 2020
Minimax weight and Q-function learning for off-policy evaluation
M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 9659-9668, 2020
Cited by 205; year 2020
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
M Uehara, W Sun
International Conference on Learning Representations, 2022
Cited by 163; year 2022
Representation Learning for Online and Offline RL in Low-rank MDPs
M Uehara, X Zhang, W Sun
International Conference on Learning Representations, 2022
Cited by 162; year 2022
Generative adversarial nets from a density ratio estimation perspective
M Uehara, I Sato, M Suzuki, K Nakayama, Y Matsuo
arXiv preprint arXiv:1610.02920, 2016
Cited by 111; year 2016
Efficiently breaking the curse of horizon in off-policy evaluation with double reinforcement learning
N Kallus, M Uehara
Operations Research 70 (6), 3282-3302, 2022
Cited by 110*; year 2022
Mitigating covariate shift in imitation learning via offline data with partial coverage
J Chang, M Uehara, D Sreenivas, R Kidambi, W Sun
Advances in Neural Information Processing Systems 34, 965-979, 2021
Cited by 105; year 2021
A review of off-policy evaluation in reinforcement learning
M Uehara, C Shi, N Kallus
arXiv preprint arXiv:2212.06355, 2022
Cited by 78; year 2022
Efficient reinforcement learning in block MDPs: A model-free representation learning approach
X Zhang, Y Song, M Uehara, M Wang, A Agarwal, W Sun
International Conference on Machine Learning, 26517-26547, 2022
Cited by 75; year 2022
Causal inference under unmeasured confounding with negative controls: A minimax learning approach
N Kallus, X Mao, M Uehara
arXiv preprint arXiv:2103.14029, 2021
Cited by 74; year 2021
Finite sample analysis of minimax offline reinforcement learning: Completeness, fast rates and first-order efficiency
M Uehara, M Imaizumi, N Jiang, N Kallus, W Sun, T Xie
arXiv preprint arXiv:2102.02981, 2021
Cited by 69; year 2021
Intrinsically efficient, stable, and bounded off-policy evaluation for reinforcement learning
N Kallus, M Uehara
Advances in Neural Information Processing Systems 32, 2019
Cited by 58; year 2019
Off-policy evaluation and learning for external validity under a covariate shift
M Uehara, M Kato, S Yasui
Advances in Neural Information Processing Systems 33, 49-61, 2020
Cited by 51*; year 2020
Statistically efficient off-policy policy gradients
N Kallus, M Uehara
Proceedings of the 37th International Conference on Machine Learning, 5089-5100, 2020
Cited by 50; year 2020
PAC Reinforcement Learning for Predictive State Representations
W Zhan, M Uehara, W Sun, JD Lee
International Conference on Learning Representations, 2023
Cited by 49; year 2023
A minimax learning approach to off-policy evaluation in confounded partially observable Markov decision processes
C Shi, M Uehara, J Huang, N Jiang
International Conference on Machine Learning, 20057-20094, 2022
Cited by 44; year 2022
Localized debiased machine learning: Efficient inference on quantile treatment effects and beyond
N Kallus, X Mao, M Uehara
Journal of Machine Learning Research 25 (16), 1-59, 2024
Cited by 40*; year 2024
Provably efficient reinforcement learning in partially observable dynamical systems
M Uehara, A Sekhari, JD Lee, N Kallus, W Sun
Advances in Neural Information Processing Systems 35, 578-592, 2022
Cited by 40; year 2022
Fine-tuning of continuous-time diffusion models as entropy-regularized control
M Uehara, Y Zhao, K Black, E Hajiramezanali, G Scalia, NL Diamant, ...
arXiv preprint arXiv:2402.15194, 2024
Cited by 39; year 2024
Optimal off-policy evaluation from multiple logging policies
N Kallus, Y Saito, M Uehara
International Conference on Machine Learning, 5247-5256, 2021
Cited by 36; year 2021
Articles 1–20