Minchen Yu
The Chinese University of Hong Kong, Shenzhen
Verified email at cuhk.edu.cn
Title
Cited by
Year
MArk: Exploiting cloud services for Cost-Effective, SLO-Aware machine learning inference serving
C Zhang, M Yu, W Wang, F Yan
2019 USENIX Annual Technical Conference (USENIX ATC 19), 1049-1062, 2019
Cited by: 344; Year: 2019
Gillis: Serving large neural networks in serverless functions with automatic model partitioning
M Yu, Z Jiang, HC Ng, W Wang, R Chen, B Li
2021 IEEE 41st International Conference on Distributed Computing Systems …, 2021
Cited by: 74; Year: 2021
Following the data, not the function: Rethinking function orchestration in serverless computing
M Yu, T Cao, W Wang, R Chen
20th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2023
Cited by: 67*; Year: 2023
Continuum: A platform for cost-aware, low-latency continual learning
H Tian, M Yu, W Wang
Proceedings of the ACM Symposium on Cloud Computing, 26-40, 2018
Cited by: 41; Year: 2018
Enabling cost-effective, SLO-aware machine learning inference serving on public cloud
C Zhang, M Yu, W Wang, F Yan
IEEE Transactions on Cloud Computing 10 (3), 1765-1779, 2020
Cited by: 35; Year: 2020
FaaSwap: SLO-aware, GPU-efficient serverless inference via model swapping
M Yu, A Wang, D Chen, H Yu, X Luo, Z Li, W Wang, R Chen, D Nie, ...
arXiv preprint arXiv:2306.03622, 2023
Cited by: 12; Year: 2023
CaraServe: CPU-assisted and rank-aware LoRA serving for generative LLM inference
S Li, H Lu, T Wu, M Yu, Q Weng, X Chen, Y Shan, B Yuan, W Wang
arXiv preprint arXiv:2401.11240, 2024
Cited by: 9; Year: 2024
CrystalPerf: Learning to Characterize the Performance of Dataflow Computation through Code Analysis
H Tian, M Yu, W Wang
2021 USENIX Annual Technical Conference (USENIX ATC 21), 253-267, 2021
Cited by: 6; Year: 2021
RepBun: Load-balanced, shuffle-free cluster caching for structured data
M Yu, Y Yu, Y Zheng, B Yang, W Wang
IEEE INFOCOM 2020-IEEE Conference on Computer Communications, 954-963, 2020
Cited by: 5; Year: 2020
λScale: Enabling Fast Scaling for Serverless Large Language Model Inference
M Yu, R Yang, C Jia, Z Su, S Yao, T Lan, Y Yang, Y Cheng, W Wang, ...
arXiv preprint arXiv:2502.09922, 2025
2025
Pheromone: Restructuring Serverless Computing With Data-Centric Function Orchestration
M Yu, T Cao, W Wang, R Chen
IEEE/ACM Transactions on Networking, 2024
2024
FaaSTube: Optimizing GPU-oriented Data Transfer for Serverless Computing
H Wu, J Deng, M Yu, Y Yu, Y Liu, H Fan, S Wu, W Wang
arXiv preprint arXiv:2411.01830, 2024
2024
Towards Usable, Efficient Serverless Computing Systems
M Yu
PQDT-Global, 2023
2023