Andrea Zanette
Assistant Professor, Carnegie Mellon University
Verified email at andrew.cmu.edu
Title
Cited by
Year
Tighter problem-dependent regret bounds in reinforcement learning without domain knowledge using value function bounds
A Zanette, E Brunskill
International Conference on Machine Learning, 7304-7312, 2019
Cited by: 322, Year: 2019
Learning near optimal policies with low inherent Bellman error
A Zanette, A Lazaric, M Kochenderfer, E Brunskill
International Conference on Machine Learning, 10978-10989, 2020
Cited by: 259, Year: 2020
Frequentist regret bounds for randomized least-squares value iteration
A Zanette, D Brandfonbrener, E Brunskill, M Pirotta, A Lazaric
International Conference on Artificial Intelligence and Statistics, 1954-1964, 2020
Cited by: 156, Year: 2020
Provable benefits of actor-critic methods for offline reinforcement learning
A Zanette, MJ Wainwright, E Brunskill
Advances in Neural Information Processing Systems 34, 13626-13640, 2021
Cited by: 145, Year: 2021
Exponential lower bounds for batch reinforcement learning: Batch RL can be exponentially harder than online RL
A Zanette
International Conference on Machine Learning, 12287-12297, 2021
Cited by: 90, Year: 2021
Provably efficient reward-agnostic navigation with linear value iteration
A Zanette, A Lazaric, MJ Kochenderfer, E Brunskill
Advances in Neural Information Processing Systems 33, 11756-11766, 2020
Cited by: 70, Year: 2020
Cautiously optimistic policy optimization and exploration with linear function approximation
A Zanette, CA Cheng, A Agarwal
Conference on Learning Theory, 4473-4525, 2021
Cited by: 62, Year: 2021
Almost horizon-free structure-aware best policy identification with a generative model
A Zanette, MJ Kochenderfer, E Brunskill
Advances in Neural Information Processing Systems 32, 2019
Cited by: 41, Year: 2019
Limiting extrapolation in linear approximate value iteration
A Zanette, A Lazaric, MJ Kochenderfer, E Brunskill
Advances in Neural Information Processing Systems 32, 2019
Cited by: 40, Year: 2019
Robust super-level set estimation using Gaussian processes
A Zanette, J Zhang, MJ Kochenderfer
Joint European Conference on Machine Learning and Knowledge Discovery in …, 2018
Cited by: 40, Year: 2018
Design of experiments for stochastic contextual linear bandits
A Zanette, K Dong, JN Lee, E Brunskill
Advances in Neural Information Processing Systems 34, 22720-22731, 2021
Cited by: 31, Year: 2021
ArCHer: Training language model agents via hierarchical multi-turn RL
Y Zhou, A Zanette, J Pan, S Levine, A Kumar
arXiv preprint arXiv:2402.19446, 2024
Cited by: 27, Year: 2024
Problem dependent reinforcement learning bounds which can identify bandit structure in MDPs
A Zanette, E Brunskill
International Conference on Machine Learning, 5747-5755, 2018
Cited by: 24, Year: 2018
When is realizability sufficient for off-policy reinforcement learning?
A Zanette
International Conference on Machine Learning, 40637-40668, 2023
Cited by: 19, Year: 2023
Bellman residual orthogonalization for offline reinforcement learning
A Zanette, MJ Wainwright
Advances in Neural Information Processing Systems 35, 3137-3151, 2022
Cited by: 11, Year: 2022
Policy finetuning in reinforcement learning via design of experiments using offline data
R Zhang, A Zanette
Advances in Neural Information Processing Systems 36, 2024
Cited by: 8, Year: 2024
Information directed reinforcement learning
A Zanette, R Sarkar
Technical report, 2017
Cited by: 7, Year: 2017
Stabilizing Q-learning with linear architectures for provably efficient learning
A Zanette, M Wainwright
International Conference on Machine Learning, 25920-25954, 2022
Cited by: 6, Year: 2022
Accelerating Best-of-N via Speculative Rejection
R Zhang, M Haider, M Yin, J Qiu, M Wang, P Bartlett, A Zanette
2nd Workshop on Advancing Neural Network Training: Computational Efficiency …
Cited by: 6
Fast best-of-N decoding via speculative rejection
H Sun, M Haider, R Zhang, H Yang, J Qiu, M Yin, M Wang, P Bartlett, ...
arXiv preprint arXiv:2410.20290, 2024
Cited by: 3, Year: 2024