Karl Cobbe
Research Scientist, OpenAI
Verified email at openai.com
Title
Cited by
Year
Training verifiers to solve math word problems
K Cobbe, V Kosaraju, M Bavarian, M Chen, H Jun, L Kaiser, M Plappert, ...
arXiv preprint arXiv:2110.14168, 2021
Cited by: 2614; Year: 2021
WebGPT: Browser-assisted question-answering with human feedback
R Nakano, J Hilton, S Balaji, J Wu, L Ouyang, C Kim, C Hesse, S Jain, ...
arXiv preprint arXiv:2112.09332, 2021
Cited by: 1120; Year: 2021
Quantifying generalization in reinforcement learning
K Cobbe, O Klimov, C Hesse, T Kim, J Schulman
International Conference on Machine Learning, 1282-1289, 2019
Cited by: 777; Year: 2019
Leveraging procedural generation to benchmark reinforcement learning
K Cobbe, C Hesse, J Hilton, J Schulman
International Conference on Machine Learning, 2048-2056, 2020
Cited by: 640; Year: 2020
Let's verify step by step
H Lightman, V Kosaraju, Y Burda, H Edwards, B Baker, T Lee, J Leike, ...
arXiv preprint arXiv:2305.20050, 2023
Cited by: 560; Year: 2023
Event scheduling presentation in a graphical user interface environment
Y Shoham, JE Bank, K Cobbe, A Matta, M Rubin, ZI Weiner, KT Toft
US Patent 10,088,973, 2018
Cited by: 230; Year: 2018
Phasic policy gradient
KW Cobbe, J Hilton, O Klimov, J Schulman
International Conference on Machine Learning, 2020-2027, 2021
Cited by: 193; Year: 2021
Training verifiers to solve math word problems, 2021
K Cobbe, V Kosaraju, M Bavarian, M Chen, H Jun, L Kaiser, M Plappert, ...
URL https://arxiv.org/abs/2110.14168, 2021
Cited by: 164; Year: 2021
OpenAI o1 system card
A Jaech, A Kalai, A Lerer, A Richardson, A El-Kishky, A Low, A Helyar, ...
arXiv preprint arXiv:2412.16720, 2024
Cited by: 27; Year: 2024
Measuring sample efficiency and generalization in reinforcement learning benchmarks: NeurIPS 2020 Procgen benchmark
S Mohanty, J Poonganam, A Gaidon, A Kolobov, B Wulfe, D Chakraborty, ...
arXiv preprint arXiv:2103.15332, 2021
Cited by: 24; Year: 2021
Batch size-invariance for policy optimization
J Hilton, K Cobbe, J Schulman
Advances in Neural Information Processing Systems 35, 17086-17098, 2022
Cited by: 15; Year: 2022