Chaojie Zhang
Verified email at microsoft.com - Homepage
Title
Cited by
Year
Splitwise: Efficient generative LLM inference using phase splitting
P Patel, E Choukse, C Zhang, A Shah, Í Goiri, S Maleki, R Bianchini
2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024
Cited by: 99; Year: 2024
Real-time serverless: Enabling application performance guarantees
HD Nguyen, C Zhang, Z Xiao, AA Chien
Proceedings of the 5th International Workshop on Serverless Computing, 1-6, 2019
Cited by: 49; Year: 2019
Flex: High-availability datacenters with zero reserved power
C Zhang, AG Kumbhare, I Manousakis, D Zhang, PA Misra, R Assis, ...
2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021
Cited by: 42; Year: 2021
Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference
J Stojkovic, E Choukse, C Zhang, I Goiri, J Torrellas
arXiv preprint arXiv:2403.20306, 2024
Cited by: 41; Year: 2024
Characterizing Power Management Opportunities for LLMs in the Cloud
P Patel, E Choukse, C Zhang, Í Goiri, B Warrier, N Mahalingam, ...
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 30; Year: 2024
Characterizing curtailed and uneconomic renewable power in the Mid-Continent Independent System Operator
AA Chien, F Yang, C Zhang
arXiv preprint arXiv:1702.05403, 2016
Cited by: 28; Year: 2016
Splitwise: Efficient generative LLM inference using phase splitting. In 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA)
P Patel, E Choukse, C Zhang, A Shah, Í Goiri, S Maleki, R Bianchini
IEEE Computer Society, Los Alamitos, CA, USA, 118-132, 2024
Cited by: 25; Year: 2024
DynamoLLM: Designing LLM inference clusters for performance and energy efficiency
J Stojkovic, C Zhang, Í Goiri, J Torrellas, E Choukse
arXiv preprint arXiv:2408.00741, 2024
Cited by: 18; Year: 2024
Designing cloud servers for lower carbon
J Wang, DS Berger, F Kazhamiaka, C Irvene, C Zhang, E Choukse, ...
2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture …, 2024
Cited by: 16; Year: 2024
Myths and misconceptions around reducing carbon embedded in cloud platforms
J Lyu, J Wang, K Frost, C Zhang, C Irvene, E Choukse, R Fonseca, ...
Proceedings of the 2nd Workshop on Sustainable Computer Systems, 1-7, 2023
Cited by: 15; Year: 2023
Beyond PUE: Flexible datacenters empowering the cloud to decarbonize
AA Chien, C Zhang, L Lin
USENIX Hot Carbon, 2022
Cited by: 13; Year: 2022
Scheduling challenges for variable capacity resources
C Zhang, AA Chien
Job Scheduling Strategies for Parallel Processing: 24th International …, 2021
Cited by: 13; Year: 2021
Information models: Creating and preserving value in volatile cloud resources
C Zhang, V Gupta, AA Chien
2019 IEEE International Conference on Cloud Engineering (IC2E), 45-55, 2019
Cited by: 11; Year: 2019
POLCA: Power oversubscription in LLM cloud providers
P Patel, E Choukse, C Zhang, Í Goiri, B Warrier, N Mahalingam, ...
arXiv preprint arXiv:2308.12908, 2023
Cited by: 10; Year: 2023
Flex: High-availability datacenters with zero reserved power. In 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA)
C Zhang, AG Kumbhare, I Manousakis, D Zhang, PA Misra, R Assis, ...
IEEE, 319-332, 2021
Cited by: 10; Year: 2021
TAPAS: Thermal-and Power-Aware Scheduling for LLM Inference in Cloud Platforms
J Stojkovic, C Zhang, Í Goiri, E Choukse, H Qiu, R Fonseca, J Torrellas, ...
arXiv preprint arXiv:2501.02600, 2025
Cited by: 3; Year: 2025
Mnemosyne: Parallelization strategies for efficiently serving multi-million context length LLM inference requests without approximations
A Agrawal, J Chen, Í Goiri, R Ramjee, C Zhang, A Tumanov, E Choukse
arXiv preprint arXiv:2409.17264, 2024
Cited by: 3; Year: 2024
Eliminating the Capacity Variation Penalty for Cloud Resource Management
C Zhang
The University of Chicago, 2023
Cited by: 3; Year: 2023
Zero-carbon cloud: research challenges for datacenters as supply-following loads
AA Chien, C Zhang, HD Nguyen
University of Chicago, Tech. Rep. CS-TR-2019-08, 2019
Cited by: 3; Year: 2019
Performance Analysis of MapReduce Implementations for High Performance Homology Search (Unrefereed Workshop Manuscript)
C Zhang, K Shirahata, S Suzuki, Y Akiyama, S Matsuoka
情報処理学会研究報告.[ハイパフォーマンスコンピューティング] 2014 (29), 1-7, 2014
Cited by: 3; Year: 2014