Shijie Cao

Cited by

	All	Since 2020
Citations	611	586
h-index	9	9
i10-index	9	9

Citations per year: 2019: 23, 2020: 65, 2021: 105, 2022: 109, 2023: 116, 2024: 167, 2025: 24

Public access (based on funding mandates)

3 articles available
0 articles not available

Co-authors

Lingxiao Ma: Senior Researcher, Microsoft Research (verified email at pku.edu.cn)
Chen Zhang (张宸): Shanghai Jiao Tong University (verified email at sjtu.edu.cn)
Wencong Xiao: Alibaba Group (verified email at alibaba-inc.com)
Zhuliang Yao: Tsinghua University (verified email at mails.tsinghua.edu.cn)
Lintao Zhang: Microsoft Research Asia (verified email at microsoft.com)
Fan Yang: Microsoft Research (verified email at microsoft.com)
Ranggi Hwang: KAIST (verified email at kaist.ac.kr)
Derek Chiou: Professor, ECE, UT Austin and Partner Architect, Microsoft Azure (verified email at ece.utexas.edu)
Xu Ningyi: Microsoft Research

Shijie Cao

Microsoft Research Asia

Verified email at microsoft.com

Efficient Deep Learning, Deep Learning System, Computer Architecture


Title	Authors	Venue	Cited by	Year
Efficient and effective sparse LSTM on FPGA with bank-balanced sparsity	S Cao, C Zhang, Z Yao, W Xiao, L Nie, D Zhan, Y Liu, M Wu, L Zhang	Proceedings of the 2019 ACM/SIGDA International Symposium on Field …, 2019	213	2019
Balanced sparsity for efficient dnn inference on gpu	Z Yao, S Cao, W Xiao, C Zhang, L Nie	Proceedings of the AAAI conference on artificial intelligence 33 (01), 5676-5683, 2019	131	2019
Seernet: Predicting convolutional neural network feature-map sparsity through low-bit quantization	S Cao, L Ma, W Xiao, C Zhang, Y Liu, L Zhang, L Nie, Z Yang	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019	96	2019
Evomoe: An evolutional mixture-of-experts training framework via dense-to-sparse gate	X Nie, X Miao, S Cao, L Ma, Q Liu, J Xue, Y Miao, Y Liu, Z Yang, B Cui	arXiv preprint arXiv:2112.14397, 2021	31	2021
Dense-to-sparse gate for mixture-of-experts	X Nie, S Cao, X Miao, L Ma, J Xue, Y Miao, Z Yang, Z Yang, CUI Bin		27	2021
Integer or floating point? new outlooks for low-bit quantization on large language models	Y Zhang, L Zhao, S Cao, S Zhang, W Wang, T Cao, F Yang, M Yang, ...	2024 IEEE International Conference on Multimedia and Expo (ICME), 1-6, 2024	25	2024
Pre-gated moe: An algorithm-system co-design for fast and scalable mixture-of-expert inference	R Hwang, J Wei, S Cao, C Hwang, X Tang, T Cao, M Yang	2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024	25	2024
Bitdistiller: Unleashing the potential of sub-4-bit llms via self-distillation	D Du, Y Zhang, S Cao, J Guo, T Cao, X Chu, N Xu	arXiv preprint arXiv:2402.10631, 2024	12	2024
Nn-stretch: Automatic neural network branching for parallel inference on heterogeneous multi-processors	J Wei, T Cao, S Cao, S Jiang, S Fu, M Yang, Y Zhang, Y Liu	Proceedings of the 21st Annual International Conference on Mobile Systems …, 2023	10	2023
Accurate and structured pruning for efficient automatic speech recognition	H Jiang, LL Zhang, Y Li, Y Wu, S Cao, T Cao, Y Yang, J Li, M Yang, L Qiu	arXiv preprint arXiv:2305.19549, 2023	9	2023
Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation	L Wang, L Ma, S Cao, Q Zhang, J Xue, Y Shi, N Zheng, Z Miao, F Yang, ...	18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024	8	2024
Efficient gpu kernels for n:m-sparse weights in deep learning	B Lin, N Zheng, L Wang, S Cao, L Ma, Q Zhang, Y Zhu, T Cao, J Xue, ...	Proceedings of Machine Learning and Systems 5, 513-525, 2023	8	2023
T-mac: Cpu renaissance via table lookup for low-bit llm deployment on edge	J Wei, S Cao, T Cao, L Ma, L Wang, Y Zhang, M Yang	arXiv preprint arXiv:2407.00088, 2024	5	2024
Afpq: Asymmetric floating point quantization for llms	Y Zhang, S Zhang, S Cao, D Du, J Wei, T Cao, N Xu	arXiv preprint arXiv:2311.01792, 2023	4	2023
Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training	Y Zhang, Y Han, S Cao, G Dai, Y Miao, T Cao, F Yang, N Xu	arXiv preprint arXiv:2305.19982, 2023	3	2023
Seerattention: Learning intrinsic sparse attention in your llms	Y Gao, Z Zeng, D Du, S Cao, HKH So, T Cao, F Yang, M Yang	arXiv preprint arXiv:2410.13276, 2024	2	2024
Lut tensor core: Lookup table enables efficient low-bit llm inference acceleration	Z Mo, L Wang, J Wei, Z Zeng, S Cao, L Ma, N Jing, T Cao, J Xue, F Yang, ...	arXiv preprint arXiv:2408.06003, 2024	2	2024
Dissecting Bit-Level Scaling Laws in Quantizing Vision Generative Models	X Ding, S Cao, T Cao, Z Chen	arXiv preprint arXiv:2501.06218, 2025		2025
Fine-Grained Structured Sparse Computing for FPGA-Based AI Inference	C Zhang, S Cao, G Dai, C Geng, Z Yao, W Xiao, Y Liu, M Wu, L Zhang, ...	IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024		2024
Automating Energy-Efficient GPU Kernel Generation: A Fast Search-Based Compilation Approach	Y Zhang, Z Gou, S Cao, W Feng, S Zhang, G Dai, N Xu	arXiv preprint arXiv:2411.18873, 2024		2024

Articles 1–20