Introducing MPT-7B: A new standard for open-source, commercially usable LLMs MosaicML NLP Team Accessed, 2023 | 276* | 2023 |
Trajectory diversity for zero-shot coordination A Lupu, B Cui, H Hu, J Foerster International conference on machine learning, 7204-7213, 2021 | 111 | 2021 |
CompilerGym: Robust, performant compiler optimization environments for AI research C Cummins, B Wasti, J Guo, B Cui, J Ansel, S Gomez, S Jain, J Liu, ... 2022 IEEE/ACM International Symposium on Code Generation and Optimization …, 2022 | 80 | 2022 |
Off-belief learning H Hu, A Lerer, B Cui, L Pineda, N Brown, J Foerster International Conference on Machine Learning, 4369-4379, 2021 | 70 | 2021 |
K-level reasoning for zero-shot coordination in Hanabi B Cui, H Hu, L Pineda, J Foerster Advances in Neural Information Processing Systems 34, 8215-8228, 2021 | 37 | 2021 |
Adversarial diversity in Hanabi B Cui, A Lupu, S Sokota, H Hu, DJ Wu, JN Foerster The Eleventh International Conference on Learning Representations, 2023 | 17 | 2023 |
Control-aware representations for model-based reinforcement learning B Cui, Y Chow, M Ghavamzadeh arXiv preprint arXiv:2006.13408, 2020 | 17 | 2020 |
Variational model-based policy optimization Y Chow, B Cui, MK Ryu, M Ghavamzadeh arXiv preprint arXiv:2006.05443, 2020 | 13 | 2020 |
Critique-out-loud reward models Z Ankner, M Paul, B Cui, JD Chang, P Ammanabrolu arXiv preprint arXiv:2408.11791, 2024 | 12 | 2024 |
Learning space partitions for path planning K Yang, T Zhang, C Cummins, B Cui, B Steiner, L Wang, JE Gonzalez, ... Advances in Neural Information Processing Systems 34, 378-391, 2021 | 11 | 2021 |
Critique-out-loud reward models, 2024 Z Ankner, M Paul, B Cui, JD Chang, P Ammanabrolu URL https://arxiv.org/abs/2408.11791, 0 | 6 | |
Off-team learning B Cui, H Hu, A Lupu, S Sokota, J Foerster Advances in Neural Information Processing Systems 35, 15407-15419, 2022 | 1 | 2022 |
Terahertz waveguide with a negative effective index of refraction measured using time domain techniques S Pandey, B Gupta, B Cui, D Schurig, A Nahata 2016 41st International Conference on Infrared, Millimeter, and Terahertz …, 2016 | 1 | 2016 |
Self-explaining deviations for coordination H Hu, S Sokota, D Wu, A Bakhtin, A Lupu, B Cui, J Foerster Advances in Neural Information Processing Systems 35, 38400-38410, 2022 | | 2022 |
Community Infrastructure for Applying Reinforcement Learning to Compiler Optimizations C Cummins, B Wasti, J Guo, B Cui, J Ansel, S Gomez, S Jain, J Liu, ... | | |