Pythia: A customizable hardware prefetching framework using online reinforcement learning
Past research has proposed numerous hardware prefetching techniques, most of which rely
on exploiting one specific type of program context information (e.g., program counter …
Effectively prefetching remote memory with Leap
Memory disaggregation over RDMA can improve the performance of memory-constrained
applications by replacing disk swapping with remote memory accesses. However, state-of …
Machine learning for computer systems and networking: A survey
Machine learning (ML) has become the de-facto approach for various scientific domains
such as computer vision and natural language processing. Despite recent breakthroughs …
Path confidence based lookahead prefetching
Designing prefetchers to maximize system performance often requires a delicate balance
between coverage and accuracy. Achieving both high coverage and accuracy is particularly …
A hierarchical neural model of data prefetching
This paper presents Voyager, a novel neural network for data prefetching. Unlike previous
neural models for prefetching, which are limited to learning delta correlations, our model can …
Bingo spatial data prefetcher
Applications extensively use data objects with a regular and fixed layout, which leads to the
recurrence of access patterns over memory regions. Spatial data prefetching techniques …
Prodigy: Improving the memory latency of data-indirect irregular workloads using hardware-software co-design
Irregular workloads are typically bottlenecked by the memory system. These workloads often
use sparse data representations, e.g., compressed sparse row/column (CSR/CSC), to …
Evaluation of hardware data prefetchers on server processors
Data prefetching, i.e., the act of predicting an application's future memory accesses and
fetching those that are not in the on-chip caches, is a well-known and widely used approach …
Perceptron-based prefetch filtering
Hardware prefetching is an effective technique for hiding cache miss latencies in modern
processor designs. Prefetcher performance can be characterized by two main metrics that …
Domino temporal data prefetcher
Big-data server applications frequently encounter data misses, and hence, lose significant
performance potential. One way to reduce the number of data misses or their effect is data …