FIRM: An intelligent fine-grained resource management framework for SLO-oriented microservices
User-facing latency-sensitive web services include numerous distributed,
intercommunicating microservices that promise to simplify software development and …
An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems
Cloud services have recently started undergoing a major shift from monolithic applications,
to graphs of hundreds or thousands of loosely-coupled microservices. Microservices …
PARTIES: QoS-aware resource partitioning for multiple interactive services
Multi-tenancy in modern datacenters is currently limited to a single latency-critical,
interactive service, running alongside one or more low-priority, best-effort jobs. This limits …
The architectural implications of cloud microservices
Cloud services have recently undergone a shift from monolithic applications to
microservices, with hundreds or thousands of loosely-coupled microservices comprising the …
Amdahl's law for tail latency
Communications of the ACM, August 2018, Vol. 61, No. 8
Translating the impact of Amdahl's Law on tail latency provides new …
Mage: Online and interference-aware scheduling for multi-scale heterogeneous systems
Heterogeneity has grown in popularity both at the core and server level as a way to improve
both performance and energy efficiency. However, despite these benefits, scheduling …
Enhancing server efficiency in the face of killer microseconds
A Mirhosseini, A Sriraman… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
We are entering an era of “killer microseconds” in data center applications. Killer
microseconds refer to μs-scale “holes” in CPU schedules caused by stalls to access fast I/O …
IADA: A dynamic interference-aware cloud scheduling architecture for latency-sensitive workloads
V Meyer, ML da Silva, DF Kirchoff… - Journal of Systems and …, 2022 - Elsevier
Cloud computing allows several applications to share physical resources, yielding rapid
provisioning and improving hardware utilization. However, multiple applications contending …
NeuroMeter: An integrated power, area, and timing modeling framework for machine learning accelerators
As Machine Learning (ML) becomes pervasive in the era of artificial intelligence, ML specific
tools and frameworks are required for architectural research. This paper introduces …
Q-zilla: A scheduling framework and core microarchitecture for tail-tolerant microservices
A Mirhosseini, BL West, GW Blake… - … Symposium on High …, 2020 - ieeexplore.ieee.org
Managing tail latency is a primary challenge in designing large-scale Internet services.
Queuing is a major contributor to end-to-end tail latency, wherein nominal tasks are …