Towards efficient deep neural network training by FPGA-based batch-level parallelism C Luo, MK Sit, H Fan, S Liu, W Luk, C Guo Journal of Semiconductors 41 (2), 022403, 2020 | 67 | 2020 |
Ekko: A Large-Scale Deep Learning Recommender System with Low-Latency Model Update C Sima, Y Fu, MK Sit, L Guo, X Gong, F Lin, J Wu, Y Li, H Rong, PL Aublin, ... 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022 | 37 | 2022 |
An experimental framework for improving the performance of BFT consensus for future permissioned blockchains MK Sit, M Bravo, Z István Proceedings of the 15th ACM International Conference on Distributed and …, 2021 | 16 | 2021 |
FPGA-based accelerator for losslessly quantized convolutional neural networks MK Sit, R Kazami, H Amano 2017 International Conference on Field Programmable Technology (ICFPT), 295-298, 2017 | 14 | 2017 |
Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness Z Tan, X Yuan, C He, MK Sit, G Li, X Liu, B Ai, K Zeng, P Pietzuch, L Mai arXiv preprint arXiv:2305.10863, 2023 | 11 | 2023 |
Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness Z Tan, X Yuan, C He, MK Sit, G Li, X Liu, B Ai, K Zeng, P Pietzuch, L Mai arXiv preprint arXiv:2305.10863, 2023 | 5 | 2023 |
Towards Improving the Performance of BFT Consensus For Future Permissioned Blockchains M Bravo, Z István, MK Sit arXiv preprint arXiv:2007.12637, 2020 | 5 | 2020 |
MoE-CAP: Cost-Accuracy-Performance Benchmarking for Mixture-of-Experts Systems Y Fu, Y Jiang, Y Huang, P Nie, Z Lu, L Xue, C He, MK Sit, J Xue, L Dong, ... arXiv preprint arXiv:2412.07067, 2024 | 1 | 2024 |
GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models H Wang, MK Sit, C He, Y Wen, W Zhang, J Wang, Y Yang, L Mai | 1 | 2023 |