SpecInfer: Accelerating Large Language Model Serving with Tree-Based Speculative Inference and Verification. X Miao, G Oliaro, Z Zhang, X Cheng, Z Wang, Z Zhang, RYY Wong, A Zhu, et al. Proceedings of the 29th ACM International Conference on Architectural …, 2024. Cited by 208.
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems. X Miao, G Oliaro, Z Zhang, X Cheng, H Jin, T Chen, Z Jia. ACM Computing Surveys (CSUR) 57 (7), 2023. Cited by 73.
Direct Telemetry Access. J Langlet, R Ben Basat, G Oliaro, M Mitzenmacher, M Yu, G Antichi. SIGCOMM 2023. Cited by 16.
Zero-CPU Collection with Direct Telemetry Access. J Langlet, R Ben-Basat, S Ramanathan, G Oliaro, M Mitzenmacher, M Yu, et al. HotNets 2021. Cited by 14.
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models. Z Zhang, D Zhao, X Miao, G Oliaro, Q Li, Y Jiang, Z Jia. ACL 2024 (🏆 Outstanding Paper Award). Cited by 7.
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning. X Miao, G Oliaro, X Cheng, M Wu, C Unger, Z Jia. arXiv preprint arXiv:2402.18789, 2024. Cited by 5.
AdaServe: SLO-Customized LLM Serving with Fine-Grained Speculative Decoding. Z Li, Z Chen, R Delacourt, G Oliaro, Z Wang, Q Chen, S Lin, A Yang, et al. arXiv preprint arXiv:2501.12162, 2025.
SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference. G Oliaro, Z Jia, D Campos, A Qiao. arXiv preprint arXiv:2411.04975, 2024.
Optimal Kernel Orchestration for Tensor Programs with Korch. M Hu, A Venkatram, S Biswas, B Marimuthu, B Hou, G Oliaro, H Wang, et al. ASPLOS 2024.