Bo Liu (Benjamin Liu)
PhD student, National University of Singapore | Prev DeepSeek, Peking University
Verified email at comp.nus.edu.sg
Title
Cited by
Year
DeepSeek-LLM: Scaling open-source language models with longtermism
X Bi, D Chen, G Chen, S Chen, D Dai, C Deng, H Ding, K Dong, Q Du, ...
arXiv preprint arXiv:2401.02954, 2024
Cited by: 218 | Year: 2024
DeepSeek-VL: Towards real-world vision-language understanding
H Lu, W Liu, B Zhang, B Wang, K Dong, B Liu, J Sun, T Ren, Z Li, Y Sun, ...
arXiv preprint arXiv:2403.05525, 2024
Cited by: 212 | Year: 2024
DeepSeek-V2: A strong, economical, and efficient mixture-of-experts language model
A Liu, B Feng, B Wang, B Wang, B Liu, C Zhao, C Deng, C Ruan, D Dai, ...
arXiv preprint arXiv:2405.04434, 2024
Cited by: 132 | Year: 2024
Learning correlated communication topology in multi-agent reinforcement learning
Y Du, B Liu, V Moens, Z Liu, Z Ren, J Wang, X Chen, H Zhang
Twentieth International Conference on Autonomous Agents and MultiAgent …, 2021
Cited by: 74 | Year: 2021
EnvPool: A highly parallel reinforcement learning environment execution engine
J Weng, M Lin, S Huang, B Liu, D Makoviichuk, V Makoviychuk, Z Liu, ...
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022), 2022
Cited by: 58 | Year: 2022
Neural auto-curricula in two-player zero-sum games
X Feng, O Slumbers, Z Wan, B Liu, S McAleer, Y Wen, J Wang, Y Yang
Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021), 2021
Cited by: 50* | Year: 2021
DeepSeek-Prover: Advancing theorem proving in LLMs through large-scale synthetic data
H Xin, D Guo, Z Shao, Z Ren, Q Zhu, B Liu, C Ruan, W Li, X Liang
arXiv preprint arXiv:2405.14333, 2024
Cited by: 36 | Year: 2024
DeepSeek-Prover-V1.5: Harnessing proof assistant feedback for reinforcement learning and Monte-Carlo tree search
H Xin, ZZ Ren, J Song, Z Shao, W Zhao, H Wang, B Liu, L Zhang, X Lu, ...
arXiv preprint arXiv:2408.08152, 2024
Cited by: 32* | Year: 2024
Grasp multiple objects with one hand
Y Li, B Liu, Y Geng, P Li, Y Yang, Y Zhu, T Liu, S Huang
IEEE Robotics and Automation Letters (RA-L), 2024
Cited by: 20 | Year: 2024
TorchOpt: An efficient library for differentiable optimization
J Ren*, X Feng*, B Liu*, X Pan*, Y Fu, L Mai, Y Yang
Journal of Machine Learning Research (JMLR), 2023
Cited by: 16 | Year: 2023
A theoretical understanding of gradient bias in meta-reinforcement learning
B Liu*, X Feng*, J Ren, L Mai, R Zhu, H Zhang, J Wang, Y Yang
Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022), 2022
Cited by: 14* | Year: 2022
Natural language reinforcement learning
X Feng, Z Wan, H Fu, B Liu, M Yang, GA Koushik, Z Hu, Y Wen, J Wang
arXiv preprint arXiv:2411.14251, 2024
Cited by: 3 | Year: 2024