Haojun Xia
Verified email at uni.sydney.edu.au
Title
Cited by
Year
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
H Xia, Z Zheng, Y Li, D Zhuang, Z Zhou, X Qiu, Y Li, W Lin, SL Song
Proceedings of the VLDB Endowment (VLDB 2024) 17 (2), 211-224, 2023
Cited by: 52; Year: 2023
Quant-LLM: Accelerating the Serving of Large Language Models via FP6-Centric Algorithm-System Co-Design on Modern GPUs
H Xia, Z Zheng, X Wu, S Chen, Z Yao, S Youn, A Bakhtiari, M Wyatt, ...
2024 USENIX Annual Technical Conference (USENIX ATC 24), 699-713, 2024
Cited by: 20*; Year: 2024
η-lstm: Co-designing highly-efficient large lstm training via exploiting memory-saving and architectural design opportunities
X Zhang, H Xia, D Zhuang, H Sun, X Fu, MB Taylor, SL Song
2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021
Cited by: 17; Year: 2021
ZeroQuant(4+2): Redefining LLMs quantization with a new FP6-centric strategy for diverse generative tasks
X Wu, H Xia, S Youn, Z Zheng, S Chen, A Bakhtiari, M Wyatt, ...
arXiv preprint arXiv:2312.08583, 2023
Cited by: 12; Year: 2023
Shift-BNN: Highly-efficient probabilistic Bayesian neural network training via memory-friendly pattern retrieving
Q Wan, H Xia, X Zhang, L Wang, SL Song, X Fu
MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture …, 2021
Cited by: 8; Year: 2021
Enabling fast and memory-efficient acceleration for pattern matching workloads: the lightweight automata processing engine
L Gong, C Wang, H Xia, X Chen, X Li, X Zhou
IEEE Transactions on Computers 72 (4), 1011-1025, 2022
Cited by: 4; Year: 2022
Lap: A lightweight automata processor for pattern matching tasks
H Xia, L Gong, C Wang, X Chen, X Zhou
2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), 844-849, 2021
Cited by: 4; Year: 2021
MonoNN: Enabling a New Monolithic Optimization Space for Neural Network Inference Tasks on Modern GPU-Centric Architectures
D Zhuang, Z Zheng, H Xia, X Qiu, J Bai, W Lin, SL Song
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2024
Cited by: 3; Year: 2024