Chenggang Zhao
DeepSeek AI
Verified email at deepseek.com
Title
Cited by
Year
Deepseek llm: Scaling open-source language models with longtermism
X Bi, D Chen, G Chen, S Chen, D Dai, C Deng, H Ding, K Dong, Q Du, ...
arXiv preprint arXiv:2401.02954, 2024
Cited by: 228; Year: 2024
Deepseekmoe: Towards ultimate expert specialization in mixture-of-experts language models
D Dai, C Deng, C Zhao, RX Xu, H Gao, D Chen, J Li, W Zeng, X Yu, Y Wu, ...
arXiv preprint arXiv:2401.06066, 2024
Cited by: 180; Year: 2024
Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning
D Guo, D Yang, H Zhang, J Song, R Zhang, R Xu, Q Zhu, S Ma, P Wang, ...
arXiv preprint arXiv:2501.12948, 2025
Cited by: 174; Year: 2025
Deepseek-v3 technical report
A Liu, B Feng, B Xue, B Wang, B Wu, C Lu, C Zhao, C Deng, C Zhang, ...
arXiv preprint arXiv:2412.19437, 2024
Cited by: 145; Year: 2024
Deepseek-coder-v2: Breaking the barrier of closed-source models in code intelligence
Q Zhu, D Guo, Z Shao, D Yang, P Wang, R Xu, Y Wu, Y Li, H Gao, S Ma, ...
arXiv preprint arXiv:2406.11931, 2024
Cited by: 134; Year: 2024
Deepseek-v2: A strong, economical, and efficient mixture-of-experts language model
A Liu, B Feng, B Wang, B Wang, B Liu, C Zhao, C Deng, C Ruan, D Dai, ...
arXiv preprint arXiv:2405.04434, 2024
Cited by: 132; Year: 2024
Deepseek-v3 technical report
AL DeepSeek-AI, B Feng, B Xue, B Wang, B Wu, C Lu, C Zhao, C Deng, ...
arXiv preprint arXiv:2412.19437, 4, 2024
Cited by: 10; Year: 2024
Auxiliary-loss-free load balancing strategy for mixture-of-experts
L Wang, H Gao, C Zhao, X Sun, D Dai
arXiv preprint arXiv:2408.15664, 2024
Cited by: 4; Year: 2024
Critique of “Planetary Normal Mode Computation: Parallel Algorithms, Performance, and Reproducibility” by SCC Team From Tsinghua University
C Zhang, C Zhao, J He, S Chen, L Zheng, K Huang, W Han, J Zhai
IEEE Transactions on Parallel and Distributed Systems 32 (11), 2631-2634, 2021
Cited by: 2; Year: 2021
Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning
W An, X Bi, G Chen, S Chen, C Deng, H Ding, K Dong, Q Du, W Gao, ...
SC24: International Conference for High Performance Computing, Networking …, 2024
Cited by: 1; Year: 2024
Canvas: End-to-End Kernel Architecture Search in Neural Networks
C Zhao, G Zhang, M Gao
arXiv preprint arXiv:2304.07741, 2023
Cited by: 1; Year: 2023
Student Cluster Competition 2018, Team Tsinghua University: Reproducing performance of multi-physics simulations of the Tsunamigenic 2004 Sumatra megathrust earthquake on the …
J He, C Zhao, J Yu, X Yu, L Zheng, C Lou, S Tang, W Han, J Zhai
Parallel Computing 90, 102570, 2019
Year: 2019