FIRM: An intelligent fine-grained resource management framework for SLO-oriented microservices

H Qiu, SS Banerjee, S Jha, ZT Kalbarczyk… - 14th USENIX symposium …, 2020 - usenix.org
User-facing latency-sensitive web services include numerous distributed,
intercommunicating microservices that promise to simplify software development and …

An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems

Y Gan, Y Zhang, D Cheng, A Shetty, P Rathi… - Proceedings of the …, 2019 - dl.acm.org
Cloud services have recently started undergoing a major shift from monolithic applications,
to graphs of hundreds or thousands of loosely-coupled microservices. Microservices …

PARTIES: QoS-aware resource partitioning for multiple interactive services

S Chen, C Delimitrou, JF Martínez - Proceedings of the Twenty-Fourth …, 2019 - dl.acm.org
Multi-tenancy in modern datacenters is currently limited to a single latency-critical,
interactive service, running alongside one or more low-priority, best-effort jobs. This limits …

The architectural implications of cloud microservices

Y Gan, C Delimitrou - IEEE Computer Architecture Letters, 2018 - ieeexplore.ieee.org
Cloud services have recently undergone a shift from monolithic applications to
microservices, with hundreds or thousands of loosely-coupled microservices comprising the …

Amdahl's law for tail latency

C Delimitrou, C Kozyrakis - Communications of the ACM, 2018 - dl.acm.org
Translating the impact of Amdahl's Law on tail latency provides new …

Mage: Online and interference-aware scheduling for multi-scale heterogeneous systems

F Romero, C Delimitrou - … of the 27th international conference on …, 2018 - dl.acm.org
Heterogeneity has grown in popularity both at the core and server level as a way to improve
both performance and energy efficiency. However, despite these benefits, scheduling …

Enhancing server efficiency in the face of killer microseconds

A Mirhosseini, A Sriraman… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
We are entering an era of “killer microseconds” in data center applications. Killer
microseconds refer to μs-scale “holes” in CPU schedules caused by stalls to access fast I/O …

IADA: A dynamic interference-aware cloud scheduling architecture for latency-sensitive workloads

V Meyer, ML da Silva, DF Kirchoff… - Journal of Systems and …, 2022 - Elsevier
Cloud computing allows several applications to share physical resources, yielding rapid
provisioning and improving hardware utilization. However, multiple applications contending …

NeuroMeter: An integrated power, area, and timing modeling framework for machine learning accelerators

T Tang, S Li, L Nai, N Jouppi… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org
As Machine Learning (ML) becomes pervasive in the era of artificial intelligence, ML specific
tools and frameworks are required for architectural research. This paper introduces …

Q-zilla: A scheduling framework and core microarchitecture for tail-tolerant microservices

A Mirhosseini, BL West, GW Blake… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Managing tail latency is a primary challenge in designing large-scale Internet services.
Queuing is a major contributor to end-to-end tail latency, wherein nominal tasks are …