Semantic segmentation-based building footprint extraction using very high-resolution satellite images and multi-source GIS data W Li, C He, J Fang, J Zheng, H Fu, L Yu Remote Sensing 11 (4), 403, 2019 | 258 | 2019 |
TurboTransformers: an efficient GPU serving system for transformer models J Fang, Y Yu, C Zhao, J Zhou Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021 | 156 | 2021 |
Colossal-AI: A unified deep learning system for large-scale parallel training S Li, H Liu, Z Bian, J Fang, H Huang, Y Liu, B Wang, Y You Proceedings of the 52nd International Conference on Parallel Processing, 766-775, 2023 | 133 | 2023 |
swDNN: A library for accelerating deep learning applications on Sunway TaihuLight J Fang, H Fu, W Zhao, B Chen, W Zheng, G Yang 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017 | 86 | 2017 |
Semantic segmentation based building extraction method using multi-source GIS map datasets and satellite imagery W Li, C He, J Fang, H Fu Proceedings of the IEEE conference on computer vision and pattern …, 2018 | 51 | 2018 |
Refactoring and optimizing the Community Atmosphere Model (CAM) on the Sunway TaihuLight supercomputer H Fu, J Liao, W Xue, L Wang, D Chen, L Gu, J Xu, N Ding, X Wang, C He, ... SC'16: Proceedings of the International Conference for High Performance …, 2016 | 45 | 2016 |
swCaffe: A parallel framework for accelerating deep learning applications on Sunway TaihuLight L Li, J Fang, H Fu, J Jiang, W Zhao, C He, X You, G Yang 2018 IEEE International Conference on Cluster Computing (CLUSTER), 413-422, 2018 | 37 | 2018 |
A parallel finite-element time-domain method for transient electromagnetic simulation H Fu, Y Wang, ES Um, J Fang, T Wei, X Huang, G Yang Geophysics 80 (4), E213-E224, 2015 | 36 | 2015 |
Parallel training of pre-trained models via chunk-based dynamic memory management J Fang, Z Zhu, S Li, H Su, Y Yu, J Zhou, Y You IEEE Transactions on Parallel and Distributed Systems 34 (1), 304-315, 2022 | 35 | 2022 |
RedSync: reducing synchronization bandwidth for distributed deep learning training system J Fang, H Fu, G Yang, CJ Hsieh Journal of Parallel and Distributed Computing 133, 30-39, 2019 | 35 | 2019 |
RoCBert: Robust Chinese Bert with multimodal contrastive pretraining H Su, W Shi, X Shen, Z Xiao, T Ji, J Fang, J Zhou Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 34 | 2022 |
FastFold: Reducing AlphaFold training time from 11 days to 67 hours S Cheng, X Zhao, G Lu, J Fang, Z Yu, T Zheng, R Wu, X Zhang, J Peng, ... arXiv preprint arXiv:2203.00854, 2022 | 26 | 2022 |
Optimizing convolutional neural networks on the Sunway TaihuLight supercomputer W Zhao, H Fu, J Fang, W Zheng, L Gan, G Yang ACM Transactions on Architecture and Code Optimization (TACO) 15 (1), 1-26, 2018 | 22 | 2018 |
Parallel multiclass support vector machine for remote sensing data classification on multicore and many-core architectures W Li, H Fu, Y You, L Yu, J Fang IEEE Journal of Selected Topics in Applied Earth Observations and Remote …, 2017 | 15 | 2017 |
LoongTrain: Efficient training of long-sequence LLMs with head-context parallelism D Gu, P Sun, Q Hu, T Huang, X Chen, Y Xiong, G Wang, Q Chen, S Zhao, ... arXiv preprint arXiv:2406.18485, 2024 | 10 | 2024 |
USP: A Unified Sequence Parallelism Approach for Long Context Generative AI J Fang, S Zhao arXiv preprint arXiv:2405.07719, 2024 | 10 | 2024 |
A dynamic agricultural prediction system for large-scale drought assessment on the Sunway TaihuLight supercomputer X Huang, C Yu, J Fang, G Huang, S Ni, J Hall, C Zorn, X Huang, W Zhang Computers and Electronics in Agriculture 154, 400-410, 2018 | 10 | 2018 |
PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference J Fang, J Pan, J Wang, A Li, X Sun arXiv preprint arXiv:2405.14430, 2024 | 9 | 2024 |
Colossal-Auto: Unified automation of parallelization and activation checkpoint for large-scale models Y Liu, S Li, J Fang, Y Shao, B Yao, Y You arXiv preprint arXiv:2302.02599, 2023 | 8 | 2023 |
Efficient AES implementation on Sunway TaihuLight supercomputer: A systematic approach L Li, J Fang, J Jiang, L Gan, W Zheng, H Fu, G Yang Journal of Parallel and Distributed Computing 138, 178-189, 2020 | 8 | 2020 |