Sobhan Miryoosefi
Princeton University | Google Research
Verified email at google.com
Title
Cited by
Year
Bellman Eluder dimension: New rich classes of RL problems, and sample-efficient algorithms
C Jin, Q Liu, S Miryoosefi
Advances in Neural Information Processing Systems 34, 13406-13418, 2021
Cited by: 268, Year: 2021
Reinforcement learning with convex constraints
S Miryoosefi, K Brantley, H Daumé III, M Dudík, R Schapire
Advances in Neural Information Processing Systems 32, 14093-14102, 2019
Cited by: 108, Year: 2019
Constrained episodic reinforcement learning in concave-convex and knapsack settings
K Brantley, M Dudik, T Lykouris, S Miryoosefi, M Simchowitz, A Slivkins, ...
Advances in Neural Information Processing Systems 33, 16315-16326, 2020
Cited by: 61, Year: 2020
Provable reinforcement learning with a short-term memory
Y Efroni, C Jin, A Krishnamurthy, S Miryoosefi
International Conference on Machine Learning, 5832-5850, 2022
Cited by: 44, Year: 2022
A simple reward-free approach to constrained reinforcement learning
S Miryoosefi, C Jin
International Conference on Machine Learning, 15666-15698, 2022
Cited by: 41, Year: 2022
ReST meets ReAct: Self-improvement for multi-step reasoning LLM agent
R Aksitov, S Miryoosefi, Z Li, D Li, S Babayan, K Kopparapu, Z Fisher, ...
arXiv preprint arXiv:2312.10003, 2023
Cited by: 29, Year: 2023
Efficient training of language models using few-shot learning
SJ Reddi, S Miryoosefi, S Karp, S Krishnan, S Kale, S Kim, S Kumar
International Conference on Machine Learning, 14553-14568, 2023
Cited by: 11, Year: 2023
Efficient Stagewise Pretraining via Progressive Subnetworks
A Panigrahi, N Saunshi, K Lyu, S Miryoosefi, S Reddi, S Kale, S Kumar
arXiv preprint arXiv:2402.05913, 2024
Cited by: 4, Year: 2024
Landscape-Aware Growing: The Power of a Little LAG
S Karp, N Saunshi, S Miryoosefi, SJ Reddi, S Kumar
arXiv preprint arXiv:2406.02469, 2024
Cited by: 1, Year: 2024
On the Inductive Bias of Stacking Towards Improving Reasoning
N Saunshi, S Karp, S Krishnan, S Miryoosefi, SJ Reddi, S Kumar
arXiv preprint arXiv:2409.19044, 2024
Year: 2024
Provable Reinforcement Learning with Constraints and Function Approximation
SSM Yoosefi
Princeton University, 2022
Year: 2022