The MIT Alewife machine: Architecture and performance
Alewife is a multiprocessor architecture that supports up to 512 processing nodes connected
over a scalable and cost-effective mesh network at a constant cost per node. The MIT …
Cooperative caching for chip multiprocessors
J Chang, GS Sohi - ACM SIGARCH Computer Architecture News, 2006 - dl.acm.org
This paper presents CMP Cooperative Caching, a unified framework to manage a CMP's
aggregate on-chip cache resources. Cooperative caching combines the strengths of private …
MOESI-prime: preventing coherence-induced hammering in commodity workloads
Prior work shows that Rowhammer attacks---which flip bits in DRAM via frequent activations
of the same row(s)---are viable. Adversaries typically mount these attacks via instruction …
Token coherence: Decoupling performance and correctness
Many future shared-memory multiprocessor servers will both target commercial workloads
and use highly-integrated "glueless" designs. Implementing low-latency cache coherence in …
A communication characterisation of Splash-2 and Parsec
N Barrow-Williams, C Fensch… - 2009 IEEE international …, 2009 - ieeexplore.ieee.org
Recent benchmark suite releases such as Parsec specifically utilise the tightly coupled
cores available in chip-multiprocessors to allow the use of newer, high performance, models …
Dynamic self-invalidation: Reducing coherence overhead in shared-memory multiprocessors
This paper introduces dynamic self-invalidation (DSI), a new technique for reducing cache
coherence overhead in shared-memory multiprocessors. DSI eliminates invalidation …
Performance of database workloads on shared-memory systems with out-of-order processors
Database applications such as online transaction processing (OLTP) and decision support
systems (DSS) constitute the largest and fastest-growing segment of the market for …
[BOOK][B] Memory consistency models for shared-memory multiprocessors
K Gharachorloo - 1996 - search.proquest.com
The memory consistency model for a shared-memory multiprocessor specifies the behavior
of memory with respect to read and write operations from multiple processors. As such, the …
Adaptive cache coherency for detecting migratory shared data
AL Cox, RJ Fowler - ACM SIGARCH Computer Architecture News, 1993 - dl.acm.org
Parallel programs exhibit a small number of distinct data-sharing patterns. A common data-
sharing pattern, migratory access, is characterized by exclusive read and write access by …
CRL: High-performance all-software distributed shared memory
The C Region Library (CRL) is a new all-software distributed shared memory (DSM)
system. CRL requires no special compiler, hardware, or operating system support beyond …