Dirigent: Enforcing QoS for latency-critical tasks on shared multicore systems
Latency-critical applications suffer from both average performance degradation and reduced
completion time predictability when collocated with batch tasks. Such variation forces the …
Mask: Redesigning the GPU memory hierarchy to support multi-application concurrency
Graphics Processing Units (GPUs) exploit large amounts of thread-level parallelism to
provide high instruction throughput and to efficiently hide long-latency stalls. The resulting …
DASH: Deadline-aware high-performance memory scheduler for heterogeneous systems with hardware accelerators
Modern SoCs integrate multiple CPU cores and hardware accelerators (HWAs) that share
the same main memory system, causing interference among memory requests from different …
Exploiting inter-warp heterogeneity to improve GPGPU performance
In a GPU, all threads within a warp execute the same instruction in lockstep. For a memory
instruction, this can lead to memory divergence: the memory requests for some threads are …
Kelp: QoS for accelerated machine learning systems
Development and deployment of machine learning (ML) accelerators in Warehouse Scale
Computers (WSCs) demand significant capital investments and engineering efforts …
Enhancing QoS in Multicore Systems with Heterogeneous Memory Configurations
J Kim, H Park, J Hong - Electronics, 2024 - mdpi.com
Quality of service (QoS) has evolved to ensure performance across various computing
environments, focusing on data bandwidth, response time, throughput, and stability …
Investigating fairness in disaggregated non-volatile memories
Many applications have growing demands for memory, particularly in the HPC space,
making the memory system a potential bottleneck of next-generation computing systems …
Providing high and controllable performance in multicore systems through shared resource management
L Subramanian - arXiv preprint arXiv:1508.03087, 2015 - arxiv.org
Multiple applications executing concurrently on a multicore system interfere with each other
at different shared resources such as main memory and shared caches. Such inter …
A memory controller with row buffer locality awareness for hybrid memory systems
Non-volatile memory (NVM) is a class of promising scalable memory technologies that can
potentially offer higher capacity than DRAM at the same cost point. Unfortunately, the access …
Exploiting the DRAM microarchitecture to increase memory-level parallelism
This paper summarizes the idea of Subarray-Level Parallelism (SALP) in DRAM, which was
published in ISCA 2012, and examines the work's significance and future potential. Modern …