Shared memory consistency models: A tutorial
The memory consistency model of a system affects performance, programmability, and
portability. We aim to describe memory consistency models in a way that most computer …
InvisiSpec: Making speculative execution invisible in the cache hierarchy
Hardware speculation offers a major surface for micro-architectural covert and side channel
attacks. Unfortunately, defending against speculative execution attacks is challenging. The …
Speculative taint tracking (STT): A comprehensive protection for speculatively accessed data
Speculative execution attacks present an enormous security threat, capable of reading
arbitrary program data under malicious speculation, and later exfiltrating that data over …
Memory persistency
Emerging nonvolatile memory technologies (NVRAM) promise the performance of DRAM
with the persistence of disk. However, constraining NVRAM write order, necessary to ensure …
[BOOK] Parallel computer architecture: a hardware/software approach
The most exciting development in parallel computer architecture is the convergence of
traditionally disparate approaches on a common machine structure. This book explains the …
The Java memory model
This paper describes the new Java memory model, which has been revised as part of Java
5.0. The model specifies the legal behaviors for a multithreaded program; it defines the …
Foundations of the C++ concurrency memory model
Currently multi-threaded C or C++ programs combine a single-threaded programming
language with a separate threads library. This is not entirely sound [7]. We describe an effort …
Reactive NUCA: near-optimal block placement and replication in distributed caches
Increases in on-chip communication delay and the large working sets of server and scientific
workloads complicate the design of the on-chip last-level cache for multicore processors …
Speculative lock elision: Enabling highly concurrent multithreaded execution
Serialization of threads due to critical sections is a fundamental bottleneck to achieving high
performance in multithreaded programs. Dynamically, such serialization may be …
Delegated persist ordering
Systems featuring a load-store interface to persistent memory (PM) are expected soon,
making in-memory persistent data structures feasible. Ensuring persistent data structure …