TVM: An Automated End-to-End Optimizing Compiler for Deep Learning | T Chen, T Moreau, Z Jiang, L Zheng, E Yan, H Shen, M Cowan, L Wang, ... | 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2018 | 2350* | 2018 |
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting | R Zhao, Y Hu, J Dotzel, C De Sa, Z Zhang | International Conference on Machine Learning, 7543-7552, 2019 | 382 | 2019 |
HeteroCL: A Multi-Paradigm Programming Infrastructure for Software-Defined Reconfigurable Computing | YH Lai, Y Chi, Y Hu, J Wang, CH Yu, Y Zhou, J Cong, Z Zhang | Proceedings of the 2019 ACM/SIGDA International Symposium on Field …, 2019 | 183* | 2019 |
FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems | Y Hu, Z Ye, M Wang, J Yu, D Zheng, M Li, Z Zhang, Z Zhang, Y Wang | SC20: International Conference for High Performance Computing, Networking …, 2020 | 102 | 2020 |
GraphLily: Accelerating Graph Linear Algebra on HBM-Equipped FPGAs | Y Hu, Y Du, E Ustun, Z Zhang | 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD), 1-9, 2021 | 78 | 2021 |
High-Performance Sparse Linear Algebra on HBM-Equipped FPGAs Using HLS: A Case Study on SpMV | Y Du, Y Hu, Z Zhou, Z Zhang | Proceedings of the 2022 ACM/SIGDA International Symposium on Field …, 2022 | 58 | 2022 |
BitFlow: Exploiting Vector Parallelism for Binary Neural Networks on CPU | Y Hu, J Zhai, D Li, Y Gong, Y Zhu, W Liu, L Su, J Jin | 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2018 | 51 | 2018 |
Building Efficient Deep Neural Networks with Unitary Group Convolutions | R Zhao, Y Hu, J Dotzel, C De Sa, Z Zhang | Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2019 | 34 | 2019 |
Unifying KV Cache Compression for Large Language Models with LeanKV | Y Zhang, Y Hu, R Zhao, J Lui, H Chen | arXiv preprint arXiv:2412.03131, 2024 | 2 | 2024 |