Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits. Q Di, T Jin, Y Wu, H Zhao, F Farnoud, Q Gu. International Conference on Learning Representations 2024, 2023 | 11 | 2023 |
Borda regret minimization for generalized linear dueling bandits. Y Wu, T Jin, H Lou, F Farnoud, Q Gu. ICML 2024, 2023 | 9 | 2023 |
Pessimistic nonlinear least-squares value iteration for offline reinforcement learning. Q Di, H Zhao, J He, Q Gu. International Conference on Learning Representations 2024, 2023 | 5 | 2023 |
Unified convergence analysis for score-based diffusion models with deterministic samplers. R Li, Q Di, Q Gu. arXiv preprint arXiv:2410.14237, 2024 | 2 | 2024 |
Nearly optimal algorithms for contextual dueling bandits from adversarial feedback. Q Di, J He, Q Gu. arXiv preprint arXiv:2404.10776, 2024 | 1 | 2024 |
Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path. Q Di, J He, D Zhou, Q Gu. International Conference on Machine Learning, 2023 | 1 | 2023 |
Relative-Translation Invariant Wasserstein Distance. B Wang, Q Di, M Yin, M Wang, Q Gu, P Wei. arXiv preprint arXiv:2409.02416, 2024 | | 2024 |