Yash Chandak
Postdoctoral Scholar, Stanford University
Verified email at stanford.edu - Homepage
Title
Cited by
Learning action representations for reinforcement learning
Y Chandak, G Theocharous, J Kostas, S Jordan, P Thomas
International Conference on Machine Learning, 941-950, 2019
Cited by: 212; Year: 2019
Evaluating the performance of reinforcement learning algorithms
S Jordan, Y Chandak, D Cohen, M Zhang, P Thomas
International Conference on Machine Learning, 4962-4973, 2020
Cited by: 79; Year: 2020
Optimizing for the future in non-stationary MDPs
Y Chandak, G Theocharous, S Shankar, M White, S Mahadevan, ...
International Conference on Machine Learning, 1414-1425, 2020
Cited by: 79; Year: 2020
Supervised pretraining can learn in-context reinforcement learning
J Lee, A Xie, A Pacchiano, Y Chandak, C Finn, O Nachum, E Brunskill
Advances in Neural Information Processing Systems 36, 43057-43083, 2023
Cited by: 63; Year: 2023
Universal off-policy evaluation
Y Chandak, S Niekum, B da Silva, E Learned-Miller, E Brunskill, ...
Advances in Neural Information Processing Systems 34, 27475-27490, 2021
Cited by: 57; Year: 2021
Understanding self-predictive learning for reinforcement learning
Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ...
International Conference on Machine Learning, 33632-33656, 2023
Cited by: 34; Year: 2023
Lifelong learning with a changing action set
Y Chandak, G Theocharous, C Nota, P Thomas
Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3373-3380, 2020
Cited by: 34; Year: 2020
Towards safe policy improvement for non-stationary MDPs
Y Chandak, S Jordan, G Theocharous, M White, PS Thomas
Advances in Neural Information Processing Systems 33, 9156-9168, 2020
Cited by: 31; Year: 2020
The GPT Surprise: Offering Large Language Model Chat in a Massive Coding Class Reduced Engagement but Increased Adopters’ Exam Performances
A Nie, Y Chandak, M Suzara, A Malik, J Woodrow, M Peng, M Sahami, ...
OSF Preprints, 2024
Cited by: 16; Year: 2024
Reinforcement learning for strategic recommendations
G Theocharous, Y Chandak, PS Thomas, F de Nijs
arXiv preprint arXiv:2009.07346, 2020
Cited by: 12; Year: 2020
Behavior alignment via reward function optimization
D Gupta, Y Chandak, S Jordan, PS Thomas, B C da Silva
Advances in Neural Information Processing Systems 36, 52759-52791, 2023
Cited by: 11; Year: 2023
Fusion graph convolutional networks
P Vijayan, Y Chandak, MM Khapra, S Parthasarathy, B Ravindran
arXiv preprint arXiv:1805.12528, 2018
Cited by: 11; Year: 2018
Reinforcement learning when all actions are not always available
Y Chandak, G Theocharous, B Metevier, P Thomas
Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 3381-3388, 2020
Cited by: 10; Year: 2020
Off-policy evaluation for action-dependent non-stationary environments
Y Chandak, S Shankar, N Bastian, B da Silva, E Brunskill, PS Thomas
Advances in Neural Information Processing Systems 35, 9217-9232, 2022
Cited by: 8; Year: 2022
Adaptive instrument design for indirect experiments
Y Chandak, S Shankar, V Syrgkanis, E Brunskill
The Twelfth International Conference on Learning Representations, 2023
Cited by: 7; Year: 2023
High-confidence off-policy (or counterfactual) variance estimation
Y Chandak, S Shankar, PS Thomas
Proceedings of the AAAI Conference on Artificial Intelligence 35 (8), 6939-6947, 2021
Cited by: 7; Year: 2021
Factored DRO: Factored distributionally robust policies for contextual bandits
T Mu, Y Chandak, TB Hashimoto, E Brunskill
Advances in Neural Information Processing Systems 35, 8318-8331, 2022
Cited by: 6; Year: 2022
On optimizing interventions in shared autonomy
W Tan, D Koleczek, S Pradhan, N Perello, V Chettiar, V Rohra, A Rajaram, ...
Proceedings of the AAAI Conference on Artificial Intelligence 36 (5), 5341-5349, 2022
Cited by: 6; Year: 2022
SOPE: Spectrum of off-policy estimators
C Yuan, Y Chandak, S Giguere, PS Thomas, S Niekum
Advances in Neural Information Processing Systems 34, 18958-18969, 2021
Cited by: 6; Year: 2021
Representations and exploration for deep reinforcement learning using singular value decomposition
Y Chandak, S Thakoor, ZD Guo, Y Tang, R Munos, W Dabney, DL Borsa
International Conference on Machine Learning, 4009-4034, 2023
Cited by: 5; Year: 2023