Ningxin Zheng

Cited by

	All	Since 2020
Citations	801	801
h-index	11	11
i10-index	11	11

Citations per year
2020	6
2021	12
2022	72
2023	158
2024	492
2025	59

Public access (based on funding mandates)

7 articles available
4 articles not available

Co-authors

Yuqing Yang; Microsoft; Verified email at microsoft.com
Quan Chen; Professor, Shanghai Jiao Tong University; Verified email at sjtu.edu.cn
Minyi Guo; IEEE Fellow, Chair Professor, Shanghai Jiao Tong University; Verified email at cs.sjtu.edu.cn
Ting Cao 曹婷; Microsoft Research; Verified email at microsoft.com
Lingxiao Ma; Senior Researcher, Microsoft Research; Verified email at pku.edu.cn
Fan Yang; Microsoft Research; Verified email at microsoft.com
Lidong Zhou; Microsoft Research; Verified email at microsoft.com
Zhenhua HAN; Microsoft Research Asia; Verified email at microsoft.com
Lili Qiu; NAI Fellow, ACM Fellow, IEEE Fellow, Professor, Dept. of Computer Science, The University of Texas; Verified email at cs.utexas.edu
Weihao Cui; Shanghai Jiao Tong University; Verified email at sjtu.edu.cn
Haibin Lin; Bytedance; Verified email at bytedance.com
Ziheng Jiang; Research Scientist, ByteDance; Verified email at bytedance.com
Wei Zhang; PhD Student of CS, Shanghai Jiao Tong University; Verified email at sjtu.edu.cn

Ningxin Zheng

Bytedance AML

Verified email at bytedance.com


Title	Cited by	Year
EfficientViT: Memory efficient vision transformer with cascaded group attention. X Liu, H Peng, N Zheng, Y Yang, H Hu, Y Yuan. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	344	2023
nn-Meter: Towards accurate latency prediction of deep-learning model inference on diverse edge devices. LL Zhang, S Han, J Wei, N Zheng, T Cao, Y Yang, Y Liu. Proceedings of the 19th Annual International Conference on Mobile Systems …, 2021	134	2021
Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction. W Cui, H Zhao, Q Chen, N Zheng, J Leng, J Zhao, Z Song, T Ma, Y Yang, ... Proceedings of the International Conference for High Performance Computing …, 2021	53	2021
SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute. N Zheng, B Lin, Q Zhang, L Ma, Y Yang, F Yang, Y Wang, M Yang, L Zhou. 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022	45	2022
Online video super-resolution with convolutional kernel bypass grafts. J Xiao, X Jiang, N Zheng, H Yang, Y Yang, Y Yang, D Li, KM Lam. IEEE Transactions on Multimedia 25, 8972-8987, 2023	30	2023
Toward QoS-awareness and improved utilization of spatial multitasking GPUs. W Zhang, Q Chen, N Zheng, W Cui, K Fu, M Guo. IEEE Transactions on Computers 71 (4), 866-879, 2021	27	2021
Astraea: towards QoS-aware and resource-efficient multi-stage GPU services. W Zhang, Q Chen, K Fu, N Zheng, Z Huang, J Leng, M Guo. Proceedings of the 27th ACM International Conference on Architectural …, 2022	25	2022
PIT: Optimization of dynamic sparse deep learning models via permutation invariant transformation. N Zheng, H Jiang, Q Zhang, Z Han, L Ma, Y Yang, F Yang, C Zhang, L Qiu, ... Proceedings of the 29th Symposium on Operating Systems Principles, 331-347, 2023	22	2023
Optimizing dynamic neural networks with Brainstorm. W Cui, Z Han, L Ouyang, Y Wang, N Zheng, L Ma, Y Yang, F Yang, J Xue, ... 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023	18	2023
URSA: Precise capacity planning and fair scheduling based on low-level statistics for public clouds. W Zhang, N Zheng, Q Chen, Y Yang, Z Song, T Ma, J Leng, M Guo. Proceedings of the 49th International Conference on Parallel Processing, 1-11, 2020	16	2020
Full-cycle energy consumption benchmark for low-carbon computer vision. B Li, X Jiang, D Bai, Y Zhang, N Zheng, X Dong, L Liu, Y Yang, D Li. arXiv preprint arXiv:2108.13465, 2021	11	2021
Online video streaming super-resolution with adaptive look-up table fusion. G Yin, X Jiang, S Jiang, Z Han, N Zheng, H Yang, D Bai, H Tan, S Sun, ... CoRR, 2023	9	2023
CHARM: Collaborative host and accelerator resource management for GPU datacenters. W Zhang, K Fu, N Zheng, Q Chen, C Li, W Zheng, M Guo. 2021 IEEE 39th International Conference on Computer Design (ICCD), 307-315, 2021	9	2021
Efficient GPU kernels for N:M-sparse weights in deep learning. B Lin, N Zheng, L Wang, S Cao, L Ma, Q Zhang, Y Zhu, T Cao, J Xue, ... Proceedings of Machine Learning and Systems 5, 513-525, 2023	8	2023
QoS-aware irregular collaborative inference for improving throughput of DNN services. K Fu, J Shi, Q Chen, N Zheng, W Zhang, D Zeng, M Guo. SC22: International Conference for High Performance Computing, Networking …, 2022	8	2022
Online streaming video super-resolution with convolutional look-up table. G Yin, Z Qu, X Jiang, S Jiang, Z Han, N Zheng, H Yang, X Liu, Y Yang, ... IEEE Transactions on Image Processing 33, 2305-2317, 2024	7	2024
Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation. L Wang, L Ma, S Cao, Q Zhang, J Xue, Y Shi, N Zheng, Z Miao, F Yang, ... 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024	7	2024
Flux: Fast software-based communication overlap on GPUs through kernel fusion. LW Chang, W Bao, Q Hou, C Jiang, N Zheng, Y Zhong, X Zhang, Z Song, ... arXiv preprint arXiv:2406.06858, 2024	6	2024
Poster: Precise capacity planning for database public clouds. N Zheng, Q Chen, Y Yang, J Li, W Zheng, M Guo. 2019 28th International Conference on Parallel Architectures and Compilation …, 2019	5	2019
SpaceEvo: Hardware-friendly search space design for efficient INT8 inference. X Wang, LL Zhang, J Xu, Q Zhang, Y Wang, Y Yang, N Zheng, T Cao, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023	4	2023
