A survey on cache management mechanisms for real-time embedded systems
Multicore processors are being extensively used by real-time systems, mainly because of
their demand for increased computing power. However, multicore processors have shared …
Heracles: Improving resource efficiency at scale
User-facing, latency-sensitive services, such as websearch, underutilize their computing
resources during daily periods of low traffic. Reusing those resources for other tasks is rarely …
Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations
As much of the world's computing continues to move into the cloud, the overprovisioning of
computing resources to ensure the performance isolation of latency-sensitive tasks, such as …
Bubble-flux: Precise online QoS management for increased utilization in warehouse scale computers
Ensuring the quality of service (QoS) for latency-sensitive applications while allowing co-
locations of multiple applications on servers is critical for improving server utilization and …
PIPP: Promotion/insertion pseudo-partitioning of multi-core shared caches
Y Xie, GH Loh - ACM SIGARCH Computer Architecture News, 2009 - dl.acm.org
Many multi-core processors employ a large last-level cache (LLC) shared among the
multiple cores. Past research has demonstrated that sharing-oblivious cache management …
Cache QoS: From concept to reality in the Intel® Xeon® processor E5-2600 v3 product family
A Herdrich, E Verplanke, P Autee… - … Symposium on High …, 2016 - ieeexplore.ieee.org
Over the last decade, addressing quality of service (QoS) in multi-core server platforms has
been a growing research topic. QoS techniques have been proposed to address the shared …
Ubik: Efficient cache sharing with strict QoS for latency-critical workloads
Chip-multiprocessors (CMPs) must often execute workload mixes with different performance
requirements. On one hand, user-facing, latency-critical applications (e.g., web search) need …
MoCA: Memory-centric, adaptive execution for multi-tenant deep neural networks
Driven by the wide adoption of deep neural networks (DNNs) across different application
domains, multi-tenancy execution, where multiple DNNs are deployed simultaneously on …
The impact of memory subsystem resource sharing on datacenter applications
In this paper we study the impact of sharing memory resources on five Google datacenter
applications: a web search engine, bigtable, content analyzer, image stitching, and protocol …
A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness
Computing workloads often contain a mix of interactive, latency-sensitive foreground
applications and recurring background computations. To guarantee responsiveness …