Yaqi Duan
Department of Technology, Operations and Statistics at NYU Stern
Verified email at stern.nyu.edu
Title
Cited by
Year
Minimax-optimal off-policy evaluation with linear function approximation
Y Duan, M Wang
International Conference on Machine Learning, 2701-2709, 2020
Cited by: 171; Year: 2020
Near-optimal offline reinforcement learning with linear representation: Leveraging variance information with pessimism
M Yin, Y Duan, M Wang, YX Wang
International Conference on Learning Representations, 2022
Cited by: 84; Year: 2022
State aggregation learning from Markov transition data
Y Duan, T Ke, M Wang
Advances in Neural Information Processing Systems, 4486-4495, 2019
Cited by: 66; Year: 2019
Risk bounds and Rademacher complexity in batch reinforcement learning
Y Duan, C Jin, Z Li
International Conference on Machine Learning, 2892-2902, 2021
Cited by: 60; Year: 2021
Bootstrapping fitted Q-evaluation for off-policy inference
B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang
International Conference on Machine Learning, 4074-4084, 2021
Cited by: 47; Year: 2021
Optimal policy evaluation using kernel-based temporal difference methods
Y Duan, M Wang, MJ Wainwright
The Annals of Statistics 52 (5), 1927-1952, 2024
Cited by: 43; Year: 2024
Sparse feature selection makes batch reinforcement learning more sample efficient
B Hao, Y Duan, T Lattimore, C Szepesvári, M Wang
International Conference on Machine Learning, 4063-4073, 2021
Cited by: 38; Year: 2021
Adaptive and robust multi-task learning
Y Duan, K Wang
The Annals of Statistics 51 (5), 2015-2039, 2023
Cited by: 35; Year: 2023
Learning low-dimensional state embeddings and metastable clusters from time series data
Y Sun, Y Duan, H Gong, M Wang
Advances in Neural Information Processing Systems, 4561-4570, 2019
Cited by: 21; Year: 2019
Bootstrapping statistical inference for off-policy evaluation
B Hao, X Ji, Y Duan, H Lu, C Szepesvári, M Wang
arXiv preprint, arXiv:2102.03607, 2021
Cited by: 17; Year: 2021
Learning good state and action representations via tensor decomposition
C Ni, AR Zhang, Y Duan, M Wang
2021 IEEE International Symposium on Information Theory (ISIT), 1682-1687, 2021
Cited by: 12; Year: 2021
Adaptive low-nonnegative-rank approximation for state aggregation of Markov chains
Y Duan, M Wang, Z Wen, Y Yuan
SIAM Journal on Matrix Analysis and Applications 41 (1), 244-278, 2020
Cited by: 9; Year: 2020
Learning good state and action representations for Markov decision process via tensor decomposition
C Ni, Y Duan, M Dahleh, M Wang, AR Zhang
Journal of Machine Learning Research 24 (115), 1-53, 2023
Cited by: 4; Year: 2023
Policy evaluation from a single path: Multi-step methods, mixing and mis-specification
Y Duan, MJ Wainwright
arXiv preprint, arXiv:2211.03899, 2022
Cited by: 4; Year: 2022
Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces
Y Duan, MJ Wainwright
Advances in Neural Information Processing Systems, 2024
Cited by: 3; Year: 2024
A finite-sample analysis of multi-step temporal difference estimates
Y Duan, MJ Wainwright
Learning for Dynamics and Control Conference, 612-624, 2023
Cited by: 3; Year: 2023
PILAF: Optimal human preference sampling for reward modeling
Y Feng, A Kwiatkowski, K Zheng, J Kempe, Y Duan
arXiv preprint, arXiv:2502.04270, 2025
Cited by: 1; Year: 2025
Localized exploration in contextual dynamic pricing achieves dimension-free regret
J Chai, Y Duan, J Fan, K Wang
arXiv preprint, arXiv:2412.19252, 2024
Cited by: 1; Year: 2024