Aviral Kumar
Verified email at andrew.cmu.edu - Homepage
Title
Cited by
Year
Gemini: a family of highly capable multimodal models
Gemini Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 3308, Year: 2023
Offline reinforcement learning: Tutorial, review, and perspectives on open problems
S Levine, A Kumar, G Tucker, J Fu
arXiv preprint arXiv:2005.01643, 2020
Cited by: 2234, Year: 2020
Conservative Q-learning for offline reinforcement learning
A Kumar, A Zhou, G Tucker, S Levine
Advances in Neural Information Processing Systems 33, 1179-1191, 2020
Cited by: 2130, Year: 2020
D4RL: Datasets for deep data-driven reinforcement learning
J Fu, A Kumar, O Nachum, G Tucker, S Levine
arXiv preprint arXiv:2004.07219, 2020
Cited by: 1334, Year: 2020
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
A Kumar, J Fu, G Tucker, S Levine
NeurIPS 2019, arXiv:1906.00949, 2019
Cited by: 1220, Year: 2019
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by: 1198, Year: 2024
Advantage-weighted regression: Simple and scalable off-policy reinforcement learning
XB Peng, A Kumar, G Zhang, S Levine
arXiv preprint arXiv:1910.00177, 2019
Cited by: 590, Year: 2019
COMBO: Conservative offline model-based policy optimization
T Yu, A Kumar, R Rafailov, A Rajeswaran, S Levine, C Finn
Advances in Neural Information Processing Systems 34, 28954-28967, 2021
Cited by: 469, Year: 2021
Trainable calibration measures for neural networks from kernel mean embeddings
A Kumar, S Sarawagi, U Jain
International Conference on Machine Learning, 2805-2814, 2018
Cited by: 331, Year: 2018
Graph Normalizing Flows
J Liu, A Kumar, J Ba, J Kiros, K Swersky
NeurIPS 2019, arXiv:1905.13177, 2019
Cited by: 313*, Year: 2019
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
C Snell, J Lee, K Xu, A Kumar
arXiv preprint arXiv:2408.03314, 2024
Cited by: 229, Year: 2024
OPAL: Offline primitive discovery for accelerating offline reinforcement learning
A Ajay, A Kumar, P Agrawal, S Levine, O Nachum
arXiv preprint arXiv:2010.13611, 2020
Cited by: 200, Year: 2020
Diagnosing Bottlenecks in Deep Q-learning Algorithms
J Fu, A Kumar, M Soh, S Levine
International Conference on Machine Learning (ICML) 2019, https://arxiv.org …, 2019
Cited by: 172, Year: 2019
Conservative safety critics for exploration
H Bharadhwaj, A Kumar, N Rhinehart, S Levine, F Shkurti, A Garg
arXiv preprint arXiv:2010.14497, 2020
Cited by: 157, Year: 2020
When should we prefer offline reinforcement learning over behavioral cloning?
A Kumar, J Hong, A Singh, S Levine
arXiv preprint arXiv:2204.05618, 2022
Cited by: 154*, Year: 2022
Why generalization in RL is difficult: Epistemic POMDPs and implicit partial observability
D Ghosh, J Rahme, A Kumar, A Zhang, RP Adams, S Levine
Advances in Neural Information Processing Systems 34, 25502-25515, 2021
Cited by: 136, Year: 2021
Implicit under-parameterization inhibits data-efficient deep reinforcement learning
A Kumar, R Agarwal, D Ghosh, S Levine
arXiv preprint arXiv:2010.14498, 2020
Cited by: 124, Year: 2020
DisCor: Corrective feedback in reinforcement learning via distribution correction
A Kumar, A Gupta, S Levine
Advances in Neural Information Processing Systems 33, 18560-18572, 2020
Cited by: 123, Year: 2020
Cal-QL: Calibrated offline RL pre-training for efficient online fine-tuning
M Nakamoto, S Zhai, A Singh, M Sobol Mark, Y Ma, C Finn, A Kumar, ...
Advances in Neural Information Processing Systems 36, 2024
Cited by: 121, Year: 2024
COG: Connecting new skills to past experience with offline reinforcement learning
A Singh, A Yu, J Yang, J Zhang, A Kumar, S Levine
arXiv preprint arXiv:2010.14500, 2020
Cited by: 116, Year: 2020