Jie Zhao
Professor of Computer Science, Hunan University
Verified email at hnu.edu.cn - Homepage
Title
Cited by
Year
AKG: automatic kernel generation for neural processing units using polyhedral transformations
J Zhao, B Li, W Nie, Z Geng, R Zhang, X Gao, B Cheng, C Wu, Y Cheng, ...
Proceedings of the 42nd ACM SIGPLAN International Conference on Programming …, 2021
Cited by: 80, Year: 2021
Apollo: Automatic partition-based operator fusion through layer by layer optimization
J Zhao, X Gao, R Xia, Z Zhang, D Chen, L Chen, R Zhang, Z Geng, ...
Proceedings of Machine Learning and Systems 4, 1-19, 2022
Cited by: 46, Year: 2022
Oneflow: Redesign the distributed deep learning framework from scratch
J Yuan, X Li, C Cheng, J Liu, R Guo, S Cai, C Yao, F Yang, X Yi, C Wu, ...
arXiv preprint arXiv:2110.15032, 2021
Cited by: 44, Year: 2021
Optimizing the memory hierarchy by compositing automatic transformations on computations and data
J Zhao, P Di
2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture …, 2020
Cited by: 32, Year: 2020
Flextended tiles: A flexible extension of overlapped tiles for polyhedral compilation
J Zhao, A Cohen
ACM Transactions on Architecture and Code Optimization (TACO) 16 (4), 1-25, 2019
Cited by: 24, Year: 2019
A polyhedral compilation framework for loops with dynamic data-dependent bounds
J Zhao, M Kruse, A Cohen
Proceedings of the 27th International Conference on Compiler Construction, 14-24, 2018
Cited by: 19, Year: 2018
WCCV: Improving the vectorization of IF-statements with warp-coherent conditions
H Sun, F Fey, J Zhao, S Gorlatch
Proceedings of the ACM International Conference on Supercomputing, 319-329, 2019
Cited by: 9, Year: 2019
Effectively scheduling computational graphs of deep neural networks toward their Domain-Specific accelerators
J Zhao, S Feng, X Dan, F Liu, C Wang, S Yuan, W Lv, Q Xie
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by: 8, Year: 2023
A holistic approach to automatic mixed-precision code generation and tuning for affine programs
J Xu, G Song, B Zhou, F Li, J Hao, J Zhao
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and …, 2024
Cited by: 7, Year: 2024
A nonlinear array subscripts dependence test
Z Jie, Z Rongcai, H Lin
2012 IEEE 14th International Conference on High Performance Computing and …, 2012
Cited by: 6, Year: 2012
An improved nonlinear data dependence test
J Zhao, R Zhao, X Chen, B Zhao
The Journal of Supercomputing 71 (1), 340-368, 2015
Cited by: 3, Year: 2015
QP test: a dependence test for quadratic array subscripts
J Zhao, R Zhao, L Han, J Xu
IET software 7 (5), 271-282, 2013
Cited by: 3, Year: 2013
A nested loop fusion algorithm based on cost analysis
Z Jie, Z Rongcai, Y Yuan
2012 IEEE 14th International Conference on High Performance Computing and …, 2012
Cited by: 3, Year: 2012
Modeling the Interplay between Loop Tiling and Fusion in Optimizing Compilers Using Affine Relations
J Zhao, J Xu, P Di, W Nie, J Hu, Y Yi, S Yang, Z Geng, R Zhang, B Li, ...
ACM Transactions on Computer Systems 41 (1-4), 1-45, 2024
Cited by: 2, Year: 2024
Eiffel: inferring input ranges of significant floating-point errors via polynomial extrapolation
Z Zhang, B Zhou, J Hao, H Yang, M Cui, Y Zhou, G Song, F Li, J Xu, ...
2023 38th IEEE/ACM International Conference on Automated Software …, 2023
Cited by: 2, Year: 2023
Parallelizing neural network models effectively on gpu by implementing reductions atomically
J Zhao, C Bastoul, Y Yi, J Hu, W Nie, R Zhang, Z Geng, C Li, T Tachon, ...
Proceedings of the International Conference on Parallel Architectures and …, 2022
Cited by: 2, Year: 2022
Automatically Generating High-performance Matrix Multiplication Kernels on the Latest Sunway Processor
X Tao, Y Zhu, B Wang, J Xu, J Pang, J Zhao
Proceedings of the 51st International Conference on Parallel Processing, 1-12, 2022
Cited by: 2, Year: 2022
A combined language and polyhedral approach to heterogeneous parallelism
J Zhao
Université Paris sciences et lettres, 2018
Cited by: 2, Year: 2018
Identifying superword level parallelism with directed graph reachability
J Zhao, RC Zhao
Scientia Sinica Informationis 47 (3), 310-325, 2017
Cited by: 2, Year: 2017
Enabling Tensor Language Model to Assist in Generating High-Performance Tensor Programs for Deep Learning
Y Zhai, S Yang, K Pan, R Zhang, S Liu, C Liu, Z Ye, J Ji, J Zhao, Y Zhang, ...
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024
Cited by: 1, Year: 2024