Ningxin Zheng
Bytedance AML
Verified email at bytedance.com
Title
Cited by
Year
EfficientViT: Memory efficient vision transformer with cascaded group attention
X Liu, H Peng, N Zheng, Y Yang, H Hu, Y Yuan
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by: 344; Year: 2023
nn-Meter: Towards accurate latency prediction of deep-learning model inference on diverse edge devices
LL Zhang, S Han, J Wei, N Zheng, T Cao, Y Yang, Y Liu
Proceedings of the 19th Annual International Conference on Mobile Systems …, 2021
Cited by: 134; Year: 2021
Enable simultaneous DNN services based on deterministic operator overlap and precise latency prediction
W Cui, H Zhao, Q Chen, N Zheng, J Leng, J Zhao, Z Song, T Ma, Y Yang, ...
Proceedings of the International Conference for High Performance Computing …, 2021
Cited by: 53; Year: 2021
SparTA: Deep-Learning Model Sparsity via Tensor-with-Sparsity-Attribute
N Zheng, B Lin, Q Zhang, L Ma, Y Yang, F Yang, Y Wang, M Yang, L Zhou
16th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2022
Cited by: 45; Year: 2022
Online video super-resolution with convolutional kernel bypass grafts
J Xiao, X Jiang, N Zheng, H Yang, Y Yang, Y Yang, D Li, KM Lam
IEEE Transactions on Multimedia 25, 8972-8987, 2023
Cited by: 30; Year: 2023
Toward QoS-awareness and improved utilization of spatial multitasking GPUs
W Zhang, Q Chen, N Zheng, W Cui, K Fu, M Guo
IEEE Transactions on Computers 71 (4), 866-879, 2021
Cited by: 27; Year: 2021
Astraea: towards QoS-aware and resource-efficient multi-stage GPU services
W Zhang, Q Chen, K Fu, N Zheng, Z Huang, J Leng, M Guo
Proceedings of the 27th ACM International Conference on Architectural …, 2022
Cited by: 25; Year: 2022
PIT: Optimization of dynamic sparse deep learning models via permutation invariant transformation
N Zheng, H Jiang, Q Zhang, Z Han, L Ma, Y Yang, F Yang, C Zhang, L Qiu, ...
Proceedings of the 29th Symposium on Operating Systems Principles, 331-347, 2023
Cited by: 22; Year: 2023
Optimizing dynamic neural networks with Brainstorm
W Cui, Z Han, L Ouyang, Y Wang, N Zheng, L Ma, Y Yang, F Yang, J Xue, ...
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by: 18; Year: 2023
URSA: Precise capacity planning and fair scheduling based on low-level statistics for public clouds
W Zhang, N Zheng, Q Chen, Y Yang, Z Song, T Ma, J Leng, M Guo
Proceedings of the 49th International Conference on Parallel Processing, 1-11, 2020
Cited by: 16; Year: 2020
Full-cycle energy consumption benchmark for low-carbon computer vision
B Li, X Jiang, D Bai, Y Zhang, N Zheng, X Dong, L Liu, Y Yang, D Li
arXiv preprint arXiv:2108.13465, 2021
Cited by: 11; Year: 2021
Online video streaming super-resolution with adaptive look-up table fusion
G Yin, X Jiang, S Jiang, Z Han, N Zheng, H Yang, D Bai, H Tan, S Sun, ...
CoRR, 2023
Cited by: 9; Year: 2023
CHARM: Collaborative host and accelerator resource management for GPU datacenters
W Zhang, K Fu, N Zheng, Q Chen, C Li, W Zheng, M Guo
2021 IEEE 39th International Conference on Computer Design (ICCD), 307-315, 2021
Cited by: 9; Year: 2021
Efficient GPU kernels for N:M-sparse weights in deep learning
B Lin, N Zheng, L Wang, S Cao, L Ma, Q Zhang, Y Zhu, T Cao, J Xue, ...
Proceedings of Machine Learning and Systems 5, 513-525, 2023
Cited by: 8; Year: 2023
QoS-aware irregular collaborative inference for improving throughput of DNN services
K Fu, J Shi, Q Chen, N Zheng, W Zhang, D Zeng, M Guo
SC22: International Conference for High Performance Computing, Networking …, 2022
Cited by: 8; Year: 2022
Online streaming video super-resolution with convolutional look-up table
G Yin, Z Qu, X Jiang, S Jiang, Z Han, N Zheng, H Yang, X Liu, Y Yang, ...
IEEE Transactions on Image Processing 33, 2305-2317, 2024
Cited by: 7; Year: 2024
Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation
L Wang, L Ma, S Cao, Q Zhang, J Xue, Y Shi, N Zheng, Z Miao, F Yang, ...
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024
Cited by: 7; Year: 2024
Flux: Fast software-based communication overlap on GPUs through kernel fusion
LW Chang, W Bao, Q Hou, C Jiang, N Zheng, Y Zhong, X Zhang, Z Song, ...
arXiv preprint arXiv:2406.06858, 2024
Cited by: 6; Year: 2024
Poster: Precise capacity planning for database public clouds
N Zheng, Q Chen, Y Yang, J Li, W Zheng, M Guo
2019 28th International Conference on Parallel Architectures and Compilation …, 2019
Cited by: 5; Year: 2019
SpaceEvo: Hardware-friendly search space design for efficient INT8 inference
X Wang, LL Zhang, J Xu, Q Zhang, Y Wang, Y Yang, N Zheng, T Cao, ...
Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023
Cited by: 4; Year: 2023