Stephen McAleer
OpenAI
Verified email at openai.com - Homepage
Title
Cited by
Year
Highly accurate machine fault diagnosis using deep transfer learning
S Shao, S McAleer, R Yan, P Baldi
IEEE Transactions on Industrial Informatics 15 (4), 2446-2455, 2018
Cited by: 1320; Year: 2018
Language Models can Solve Computer Tasks
G Kim, P Baldi, S McAleer
Neural Information Processing Systems (NeurIPS), 2023
Cited by: 310; Year: 2023
Llemma: An Open Language Model for Mathematics
Z Azerbayev, H Schoelkopf, K Paster, M Dos Santos, S McAleer, AQ Jiang, ...
International Conference on Learning Representations (ICLR), 2023
Cited by: 255; Year: 2023
Mastering the game of Stratego with model-free multiagent reinforcement learning
J Perolat, B De Vylder, D Hennes, E Tarassov, F Strub, V de Boer, ...
Science 378 (6623), 990-996, 2022
Cited by: 246; Year: 2022
Solving the Rubik’s cube with deep reinforcement learning and search
F Agostinelli*, S McAleer*, A Shmakov*, P Baldi
Nature Machine Intelligence 1 (8), 356-363, 2019
Cited by: 244; Year: 2019
AI Alignment: A Comprehensive Survey
J Ji, T Qiu, B Chen, B Zhang, H Lou, K Wang, Y Duan, Z He, J Zhou, ...
arXiv preprint arXiv:2310.19852, 2023
Cited by: 226; Year: 2023
Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning
Y Chen, Y Yang, T Wu, S Wang, X Feng, J Jiang, SM McAleer, H Dong, ...
36th Conference on Neural Information Processing Systems (NeurIPS 2022 …, 2022
Cited by: 104; Year: 2022
Solving the Rubik's Cube with Approximate Policy Iteration
S McAleer*, F Agostinelli*, A Shmakov*, P Baldi
International Conference on Learning Representations (ICLR), 2018
Cited by: 103*; Year: 2018
Pipeline PSRO: A scalable approach for finding approximate nash equilibria in large games
S McAleer*, J Lanier*, R Fox, P Baldi
34th Conference on Neural Information Processing Systems (NeurIPS), 2020
Cited by: 89; Year: 2020
Alphazero-like tree-search can guide large language model decoding and training
Z Wan, X Feng, M Wen, SM McAleer, Y Wen, W Zhang, J Wang
Forty-first International Conference on Machine Learning, 2024
Cited by: 86; Year: 2024
Evolutionary reinforcement learning for sample-efficient multiagent coordination
S Majumdar, S Khadka, S Miret, S McAleer, K Tumer
International Conference on Machine Learning (ICML), 2020
Cited by: 75; Year: 2020
XDO: A double oracle algorithm for extensive-form games
S McAleer, J Lanier, P Baldi, R Fox
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 62; Year: 2021
Independent Natural Policy Gradient Always Converges in Markov Potential Games
R Fox, S McAleer, W Overman, I Panageas
AISTATS 2022, 2021
Cited by: 58; Year: 2021
Neural auto-curricula in two-player zero-sum games
X Feng, O Slumbers, Z Wan, B Liu, S McAleer, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems (NeurIPS), 2021
Cited by: 52*; Year: 2021
Online Double Oracle
LC Dinh, Y Yang, S McAleer, NP Nieves, O Slumbers, Z Tian, DH Mguni, ...
Transactions on Machine Learning Research, 2021
Cited by: 36; Year: 2021
Confronting Reward Model Overoptimization with Constrained RLHF
T Moskovitz, AK Singh, DJ Strouse, T Sandholm, R Salakhutdinov, ...
International Conference on Learning Representations (ICLR) spotlight, 2023
Cited by: 34; Year: 2023
Deep-learning-based reconstruction of the neutrino direction and energy for in-ice radio detectors
C Glaser, S McAleer, S Stjärnholm, P Baldi, SW Barwick
Astroparticle Physics 145, 102781, 2023
Cited by: 32*; Year: 2023
White Paper: ARIANNA-200 high energy neutrino telescope
A Anker, P Baldi, SW Barwick, D Bergman, H Bernhoff, DZ Besson, ...
arXiv preprint arXiv:2004.09841, 2020
Cited by: 30; Year: 2020
Tree search for language model agents
JY Koh, S McAleer, D Fried, R Salakhutdinov
arXiv preprint arXiv:2407.01476, 2024
Cited by: 29; Year: 2024
Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games
S McAleer, JB Lanier, K Wang, P Baldi, R Fox, T Sandholm
International Conference on Learning Representations (ICLR), 2022
Cited by: 29*; Year: 2022