Pythia: A customizable hardware prefetching framework using online reinforcement learning

R Bera, K Kanellopoulos, A Nori, T Shahroodi… - MICRO-54: 54th Annual …, 2021 - dl.acm.org
Past research has proposed numerous hardware prefetching techniques, most of which rely
on exploiting one specific type of program context information (e.g., program counter …

The championship simulator: Architectural simulation for education and competition

N Gober, G Chacon, L Wang, PV Gratz… - arXiv preprint arXiv …, 2022 - arxiv.org
Recent years have seen a dramatic increase in the microarchitectural complexity of
processors. This increase in complexity presents a twofold challenge for the field of …

Decoupled vector runahead

A Naithani, J Roelandts, S Ainsworth… - Proceedings of the 56th …, 2023 - dl.acm.org
We present Decoupled Vector Runahead (DVR), an in-core prefetching technique,
executing separately to the main application thread, that exploits massive amounts of …

AfterImage: Leaking control flow data and tracking load operations via the hardware prefetcher

Y Chen, L Pei, TE Carlson - Proceedings of the 28th ACM International …, 2023 - dl.acm.org
Research into processor-based side-channels has seen both a large number and a large
variety of disclosed vulnerabilities that can leak critical, private data to malicious attackers …

Hermes: Accelerating long-latency load requests via perceptron-based off-chip load prediction

R Bera, K Kanellopoulos… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
Long-latency load requests continue to limit the performance of modern high-performance
processors. To increase the latency tolerance of a processor, architects have primarily relied …

CLIP: Load criticality based data prefetching for bandwidth-constrained many-core systems

B Panda - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Hardware prefetching is a latency-hiding technique that hides the costly off-chip DRAM
accesses. However, state-of-the-art prefetchers fail to deliver performance improvement in …

Effective mimicry of Belady's MIN policy

I Shah, A Jain, C Lin - 2022 IEEE International Symposium on …, 2022 - ieeexplore.ieee.org
The past decade has seen the rise of highly successful cache replacement policies that are
based on binary prediction. For example, the Hawkeye policy learns whether lines loaded …

Micro-armed bandit: lightweight & reusable reinforcement learning for microarchitecture decision-making

G Gerogiannis, J Torrellas - Proceedings of the 56th Annual IEEE/ACM …, 2023 - dl.acm.org
Online Reinforcement Learning (RL) has been adopted as an effective mechanism in
various decision-making problems in microarchitecture. Its high adaptability and the ability to …

Berti: an accurate local-delta data prefetcher

A Navarro-Torres, B Panda… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org
Data prefetching is a technique that plays a crucial role in modern high-performance
processors by hiding long latency memory accesses. Several state-of-the-art hardware …

Snake: A variable-length chain-based prefetching for GPUs

S Mostofi, H Falahati, N Mahani… - Proceedings of the 56th …, 2023 - dl.acm.org
Graphics Processing Units (GPUs) utilize memory hierarchy and Thread-Level Parallelism
(TLP) to tolerate off-chip memory latency, which is a significant bottleneck for memory-bound …