Yao Liu
Amazon
Verified email at stanford.edu
Title
Cited by
Year
Provably good batch reinforcement learning without great exploration
Y Liu, A Swaminathan, A Agarwal, E Brunskill
Advances in Neural Information Processing Systems 33, 1264–1274, 2020
Cited by: 231 | Year: 2020
Off-Policy Policy Gradient with Stationary Distribution Correction
Y Liu, A Swaminathan, A Agarwal, E Brunskill
Proceedings of The 35th Uncertainty in Artificial Intelligence Conference …, 2019
Cited by: 188* | Year: 2019
Representation balancing MDPs for off-policy policy evaluation
Y Liu, O Gottesman, A Raghu, M Komorowski, A Faisal, F Doshi-Velez, ...
Advances in Neural Information Processing Systems 31, 2644–2653, 2018
Cited by: 85 | Year: 2018
Interpretable off-policy evaluation in reinforcement learning by highlighting influential transitions
O Gottesman, J Futoma, Y Liu, S Parbhoo, L Celi, E Brunskill, ...
International Conference on Machine Learning, 3658-3667, 2020
Cited by: 69 | Year: 2020
Behaviour policy estimation in off-policy policy evaluation: Calibration matters
A Raghu, O Gottesman, Y Liu, M Komorowski, A Faisal, F Doshi-Velez, ...
arXiv preprint arXiv:1807.01066, 2018
Cited by: 47 | Year: 2018
Understanding the curse of horizon in off-policy evaluation via conditional importance sampling
Y Liu, PL Bacon, E Brunskill
International Conference on Machine Learning, 6184-6193, 2020
Cited by: 44 | Year: 2020
Combining parametric and nonparametric models for off-policy evaluation
O Gottesman, Y Liu, S Sussex, E Brunskill, F Doshi-Velez
International Conference on Machine Learning, 2366-2375, 2019
Cited by: 36 | Year: 2019
When Simple Exploration is Sample Efficient: Identifying Sufficient Conditions for Random Exploration to Yield PAC RL Algorithms
Y Liu, E Brunskill
The 14th European Workshop on Reinforcement Learning, 2018
Cited by: 27 | Year: 2018
PAC continuous state online multitask reinforcement learning with identification
Y Liu, Z Guo, E Brunskill
Proceedings of the 2016 International Conference on Autonomous Agents …, 2016
Cited by: 22 | Year: 2016
Reinforcement learning tutor better supported lower performers in a math task
S Ruan, A Nie, W Steenbergen, J He, JQ Zhang, M Guo, Y Liu, ...
Machine Learning 113 (5), 3023-3048, 2024
Cited by: 20 | Year: 2024
TAIL: Task-specific adapters for imitation learning with large pretrained models
Z Liu, J Zhang, K Asadi, Y Liu, D Zhao, S Sabach, R Fakoor
arXiv preprint arXiv:2310.05905, 2023
Cited by: 17 | Year: 2023
All-action policy gradient methods: A numerical integration approach
B Petit, L Amdahl-Culleton, Y Liu, J Smith, PL Bacon
arXiv preprint arXiv:1910.09093, 2019
Cited by: 9 | Year: 2019
TD convergence: An optimization perspective
K Asadi, S Sabach, Y Liu, O Gottesman, R Fakoor
Advances in Neural Information Processing Systems 36, 49169-49186, 2023
Cited by: 7 | Year: 2023
Budgeting counterfactual for offline RL
Y Liu, P Chaudhari, R Fakoor
Advances in Neural Information Processing Systems 36, 5729-5751, 2023
Cited by: 5 | Year: 2023
Offline policy optimization with eligible actions
Y Liu, Y Flet-Berliac, E Brunskill
Uncertainty in Artificial Intelligence, 1253-1263, 2022
Cited by: 4 | Year: 2022
Nonlinear dimensionality reduction by local orthogonality preserving alignment
T Lin, Y Liu, B Wang, LW Wang, HB Zha
Journal of Computer Science and Technology 31 (3), 512-524, 2016
Cited by: 4* | Year: 2016
AgentOccam: A simple yet strong baseline for LLM-based web agents
K Yang, Y Liu, S Chaudhary, R Fakoor, P Chaudhari, G Karypis, ...
arXiv preprint arXiv:2410.13825, 2024
Cited by: 3 | Year: 2024
EXTRACT: Efficient Policy Learning by Extracting Transferable Robot Skills from Offline Data
J Zhang, M Heo, Z Liu, E Biyik, JJ Lim, Y Liu, R Fakoor
arXiv preprint arXiv:2406.17768, 2024
Cited by: 2 | Year: 2024
Provably sample-efficient RL with side information about latent dynamics
Y Liu, D Misra, M Dudík, RE Schapire
Advances in Neural Information Processing Systems 35, 33482-33493, 2022
Cited by: 2 | Year: 2022
Learning the target network in function space
K Asadi, Y Liu, S Sabach, M Yin, R Fakoor
arXiv preprint arXiv:2406.01838, 2024
Cited by: 1 | Year: 2024