FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters | S Cheng, X Zhao, G Lu, J Fang, T Zheng, R Wu, X Zhang, J Peng, Y You | Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and …, 2024 | 30* | 2024 |
Real-time Video Generation with Pyramid Attention Broadcast | X Zhao, X Jin, K Wang, Y You | Proceedings of the Thirteenth International Conference on Learning …, 2024 | 15 | 2024 |
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices | X Zhao, B Jia, H Zhou, Z Liu, S Cheng, Y You | Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024 | 10* | 2024 |
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers | X Zhao, S Cheng, C Chen, Z Zheng, Z Liu, Z Yang, Y You | arXiv preprint arXiv:2403.10266, 2024 | 3 | 2024 |
AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference | X Zhao, S Cheng, G Lu, J Fang, H Zhou, B Jia, Z Liu, Y You | Proceedings of the Twelfth International Conference on Learning Representations, 2024 | 2 | 2024 |
Wallfacer: Guiding transformer model training out of the long-context dark forest with n-body problem | Z Liu, S Wang, S Cheng, Z Zhao, Y Bai, X Zhao, J Demmel, Y You | arXiv preprint arXiv:2407.00611, 2024 | 1 | 2024 |
Enhance-A-Video: Better Generated Video for Free | Y Luo, X Zhao, M Chen, K Zhang, W Shao, K Wang, Z Wang, Y You | arXiv preprint arXiv:2502.07508, 2025 | | 2025 |
Concerto: Automatic Communication Optimization and Scheduling for Large-Scale Deep Learning | S Cheng, S Lin, L Diao, H Wu, S Wang, C Si, Z Liu, X Zhao, J Du, W Lin, ... | Proceedings of the 30th ACM International Conference on Architectural …, 2025 | | 2025 |
Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training | M Shi, Y Zhou, R Yu, Z Li, Z Liang, X Zhao, X Peng, T Rajpurohit, ... | arXiv preprint arXiv:2412.12496, 2024 | | 2024 |