HET: scaling out huge embedding model training via cache-enabled distributed framework X Miao, H Zhang, Y Shi, X Nie, Z Yang, Y Tao, B Cui arXiv preprint arXiv:2112.07221, 2021 | 55 | 2021 |
Welder: Scheduling deep learning memory access via tile-graph Y Shi, Z Yang, J Xue, L Ma, Y Xia, Z Miao, Y Guo, F Yang, L Zhou 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023 | 26 | 2023 |
HET-GMP: A graph-based system approach to scaling large embedding model training X Miao, Y Shi, H Zhang, X Zhang, X Nie, Z Yang, B Cui Proceedings of the 2022 International Conference on Management of Data, 470-480, 2022 | 19 | 2022 |
SDPipe: A semi-decentralized framework for heterogeneity-aware pipeline-parallel training X Miao, Y Shi, Z Yang, B Cui, Z Jia Proceedings of the VLDB Endowment 16 (9), 2354-2363, 2023 | 13 | 2023 |
Cocktailer: Analyzing and Optimizing Dynamic Control Flow in Deep Learning C Zhang, L Ma, J Xue, Y Shi, Z Miao, F Yang, J Zhai, Z Yang, M Yang 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023 | 11 | 2023 |
Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation L Wang, L Ma, S Cao, Q Zhang, J Xue, Y Shi, N Zheng, Z Miao, F Yang, ... 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024 | 9 | 2024 |