Kefan Dong
Verified email at stanford.edu
Title
Cited by
Year
Q-learning with UCB exploration is sample efficient for infinite-horizon MDP
K Dong, Y Wang, X Chen, L Wang
International Conference on Learning Representations, 2019
Cited by: 124; Year: 2019
Exploration via hindsight goal generation
Z Ren, K Dong, Y Zhou, Q Liu, J Peng
Advances in Neural Information Processing Systems 32, 2019
Cited by: 99; Year: 2019
Root-n-regret for learning in Markov decision processes with function approximation and low Bellman rank
K Dong, J Peng, Y Wang, Y Zhou
Conference on Learning Theory, 1554-1557, 2020
Cited by: 49; Year: 2020
Provable model-based nonlinear bandit and reinforcement learning: Shelve optimism, embrace virtual curvature
K Dong, J Yang, T Ma
Advances in Neural Information Processing Systems 34, 26168-26182, 2021
Cited by: 44; Year: 2021
On the expressivity of neural networks for deep reinforcement learning
K Dong, Y Luo, T Yu, C Finn, T Ma
International Conference on Machine Learning, 2627-2637, 2020
Cited by: 36; Year: 2020
Design of experiments for stochastic contextual linear bandits
A Zanette, K Dong, JN Lee, E Brunskill
Advances in Neural Information Processing Systems 34, 22720-22731, 2021
Cited by: 32; Year: 2021
First steps toward understanding the extrapolation of nonlinear models to unseen domains
K Dong, T Ma
arXiv preprint arXiv:2211.11719, 2022
Cited by: 22; Year: 2022
Multinomial logit bandit with low switching cost
K Dong, Y Li, Q Zhang, Y Zhou
International Conference on Machine Learning, 2607-2615, 2020
Cited by: 21; Year: 2020
Beyond NTK with vanilla gradient descent: A mean-field analysis of neural networks with polynomial width, samples, and time
A Mahankali, H Zhang, K Dong, M Glasgow, T Ma
Advances in Neural Information Processing Systems 36, 57367-57480, 2023
Cited by: 14; Year: 2023
Asymptotic instance-optimal algorithms for interactive decision making
K Dong, T Ma
arXiv preprint arXiv:2206.02326, 2022
Cited by: 14; Year: 2022
Model-based offline reinforcement learning with local misspecification
K Dong, Y Flet-Berliac, A Nie, E Brunskill
Proceedings of the AAAI Conference on Artificial Intelligence 37 (6), 7423-7431, 2023
Cited by: 4; Year: 2023
Toward L_∞ Recovery of Nonlinear Functions: A Polynomial Sample Complexity Bound for Gaussian Random Fields
K Dong, T Ma
The Thirty Sixth Annual Conference on Learning Theory, 2877-2918, 2023
Cited by: 3; Year: 2023
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically
K Dong, A Mahankali, T Ma
arXiv preprint arXiv:2411.01829, 2024
Cited by: 2; Year: 2024
Refined analysis of FPL for adversarial Markov decision processes
Y Wang, K Dong
arXiv preprint arXiv:2008.09251, 2020
Cited by: 2; Year: 2020
STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving
K Dong, T Ma
arXiv preprint arXiv:2502.00212, 2025
Year: 2025
Articles 1–15