Runji Lin
Title
Cited by
Year
Qwen technical report
J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng, Y Fan, W Ge, Y Han, F Huang, ...
arXiv preprint arXiv:2309.16609, 2023
Cited by: 1882, Year: 2023
Qwen2.5 technical report
A Yang, B Yang, B Zhang, B Hui, B Zheng, B Yu, C Li, D Liu, F Huang, ...
arXiv preprint arXiv:2412.15115, 2024
Cited by: 844, Year: 2024
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
M Wen, JG Kuba, R Lin, W Zhang, Y Wen, J Wang, Y Yang
NeurIPS 2022, 2022
Cited by: 198, Year: 2022
#InsTag: Instruction tagging for analyzing supervised fine-tuning of large language models
K Lu, H Yuan, Z Yuan, R Lin, J Lin, C Tan, C Zhou, J Zhou
The Twelfth International Conference on Learning Representations, 2023
Cited by: 75, Year: 2023
Routing to the expert: Efficient reward-guided ensemble of large language models
K Lu, H Yuan, R Lin, J Lin, Z Yuan, C Zhou, J Zhou
NAACL, 2023
Cited by: 48, Year: 2023
Qwen2.5-Math technical report: Toward mathematical expert model via self-improvement
A Yang, B Zhang, B Hui, B Gao, B Yu, C Li, D Liu, J Tu, J Zhou, J Lin, K Lu, ...
arXiv preprint arXiv:2409.12122, 2024
Cited by: 41, Year: 2024
Large language models play StarCraft II: Benchmarks and a chain of summarization approach
W Ma, Q Mi, X Yan, Y Wu, R Lin, H Zhang, J Wang
NeurIPS 2024, 2023
Cited by: 36, Year: 2023
Large Sequence Models for Sequential Decision-Making: A Survey
M Wen, R Lin, H Wang, Y Yang, Y Wen, L Mai, J Wang, H Zhang, ...
Frontiers of Computer Science, 2023
Cited by: 33, Year: 2023
Online merging optimizers for boosting rewards and mitigating tax in alignment
K Lu, B Yu, F Huang, Y Fan, R Lin, C Zhou
arXiv preprint arXiv:2405.17931, 2024
Cited by: 15, Year: 2024
Contextual Transformer for Offline Meta Reinforcement Learning
R Lin, Y Li, X Feng, Z Zhang, XHW Fung, H Zhang, J Wang, Y Du, Y Yang
NeurIPS 2022 Workshop: Foundation Models for Decision Making, 2022
Cited by: 11, Year: 2022
Learn to flap: foil non-parametric path planning via deep reinforcement learning
ZP Wang, RJ Lin, ZY Zhao, X Chen, PM Guo, N Yang, ZC Wang, DX Fan
Journal of Fluid Mechanics 984, A9, 2024
Cited by: 10, Year: 2024
Scalable Model-based Policy Optimization for Decentralized Networked Systems
Y Du, C Ma, Y Liu, R Lin, H Dong, J Wang, Y Yang
IROS 2022, 2022
Cited by: 8*, Year: 2022
ProcessBench: Identifying process errors in mathematical reasoning
C Zheng, Z Zhang, B Zhang, R Lin, K Lu, B Yu, D Liu, J Zhou, J Lin
arXiv preprint arXiv:2412.06559, 2024
Cited by: 6, Year: 2024
LLM critics help catch bugs in mathematics: Towards a better mathematical verifier with natural language feedback
B Gao, Z Cai, R Xu, P Wang, C Zheng, R Lin, K Lu, J Lin, C Zhou, W Xiao, ...
arXiv preprint arXiv:2406.14024, 2024
Cited by: 5, Year: 2024
Increasing the Data Rate for Reflected Optical Camera Communication Using Uniform LED Light
Z Chen, R Lin, H Duan, Y Chen, Y Yang, R Wu, L Chen
IEEE INFOCOM 2020-IEEE Conference on Computer Communications Workshops …, 2020
Cited by: 1, Year: 2020
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Z Zhang, C Zheng, Y Wu, B Zhang, R Lin, B Yu, D Liu, J Zhou, J Lin
arXiv preprint arXiv:2501.07301, 2025
2025
Online Decision MetaMorphFormer: A Casual Transformer-Based Reinforcement Learning Framework of Universal Embodied Intelligence
L Ji, R Lin
arXiv preprint arXiv:2409.07341, 2024
2024
Learning Robust Communication by Adversarial Training in Networked System Control
R Lin, H Zhang
Chinese Conference on Swarm Intelligence and Cooperative Control, 605-619, 2023
2023