Tiancheng Jin
Verified email at usc.edu
Title
Cited by
Learning adversarial Markov decision processes with bandit feedback and unknown transition
C Jin, T Jin, H Luo, S Sra, T Yu
International Conference on Machine Learning, 4860-4869, 2020
Cited by 154*, 2020
Deep reinforcement learning for multi-driver vehicle dispatching and repositioning problem
J Holler, R Vuorio, Z Qin, X Tang, Y Jiao, T Jin, S Singh, C Wang, J Ye
2019 IEEE International Conference on Data Mining (ICDM), 1090-1095, 2019
Cited by 138, 2019
Simultaneously learning stochastic and adversarial episodic MDPs with known transition
T Jin, H Luo
Advances in Neural Information Processing Systems 33, 16557-16566, 2020
Cited by 64, 2020
The best of both worlds: stochastic and adversarial episodic MDPs with unknown transition
T Jin, L Huang, H Luo
Advances in Neural Information Processing Systems 34, 20491-20502, 2021
Cited by 51, 2021
Boosting dynamic programming with neural networks for solving NP-hard problems
F Yang, T Jin, TY Liu, X Sun, J Zhang
Asian Conference on Machine Learning, 726-739, 2018
Cited by 28, 2018
Near-optimal regret for adversarial MDP with delayed bandit feedback
T Jin, T Lancewicki, H Luo, Y Mansour, A Rosenberg
Advances in Neural Information Processing Systems 35, 33469-33481, 2022
Cited by 26, 2022
Learning adversarial MDPs with bandit feedback and unknown transition
C Jin, T Jin, H Luo, S Sra, T Yu
arXiv preprint arXiv:1912.01192, 2019
Cited by 21, 2019
Improved best-of-both-worlds guarantees for multi-armed bandits: FTRL with general regularizers and multiple optimal arms
T Jin, J Liu, H Luo
Advances in Neural Information Processing Systems 36, 30918-30978, 2023
Cited by 20, 2023
No-regret online reinforcement learning with adversarial losses and transitions
T Jin, J Liu, C Rouyer, W Chang, CY Wei, H Luo
Advances in Neural Information Processing Systems 36, 2024
Cited by 12, 2024
Learning adversarial MDPs with bandit feedback and unknown transition
C Jin, T Jin, H Luo, S Sra, T Yu
2019
Cited by 5, 2019
Robust and Adaptive Online Reinforcement Learning
T Jin
University of Southern California, 2024
2024
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback
A Rosenberg, H Luo, T Jin, Y Mansour
2022