DeNovo: Rethinking the memory hierarchy for disciplined parallelism
For parallelism to become tractable for mass programmers, shared-memory languages and
environments must evolve to enforce disciplined practices that ban" wild shared-memory …
environments must evolve to enforce disciplined practices that ban" wild shared-memory …
SynFull: Synthetic traffic models capturing cache coherent behaviour
Modern and future many-core systems represent complex architectures. The communication
fabrics of these large systems heavily influence their performance and power consumption …
fabrics of these large systems heavily influence their performance and power consumption …
Cuckoo directory: A scalable directory for many-core systems
Growing core counts have highlighted the need for scalable on-chip coherence
mechanisms. The increase in the number of on-chip cores exposes the energy and area …
mechanisms. The increase in the number of on-chip cores exposes the energy and area …
SCD: A scalable coherence directory with flexible sharer set encoding
Large-scale CMPs with hundreds of cores require a directory-based protocol to maintain
cache coherence. However, previously proposed coherence directories are hard to scale …
cache coherence. However, previously proposed coherence directories are hard to scale …
The bunker cache for spatio-value approximation
The cost of moving and storing data is still a fundamental concern for computer architects.
Inefficient handling of data can be attributed to conventional architectures being oblivious to …
Inefficient handling of data can be attributed to conventional architectures being oblivious to …
SPACE: Sharing pattern-based directory coherence for multicore scalability
An important challenge in multicore processors is the maintenance of cache coherence in a
scalable manner. Directory-based protocols save bandwidth and achieve scalability by …
scalable manner. Directory-based protocols save bandwidth and achieve scalability by …
Selective GPU caches to eliminate CPU-GPU HW cache coherence
Cache coherence is ubiquitous in shared memory multiprocessors because it provides a
simple, high performance memory abstraction to programmers. Recent work suggests …
simple, high performance memory abstraction to programmers. Recent work suggests …
Spatiotemporal coherence tracking
M Alisafaee - 2012 45th Annual IEEE/ACM International …, 2012 - ieeexplore.ieee.org
Chip-multiprocessors require a coherence directory to track data sharing and order
accesses to the shared data. Scaling coherence directories to support a large number of …
accesses to the shared data. Scaling coherence directories to support a large number of …
CCNoC: Specializing on-chip interconnects for energy efficiency in cache-coherent servers
Many core chips are emerging as the architecture of choice to provide power efficiency and
improve performance, while riding Moore's Law. In these architectures, on-chip inter …
improve performance, while riding Moore's Law. In these architectures, on-chip inter …
Coherence domain restriction on large scale systems
Designing massive scale cache coherence systems has been an elusive goal. Whether it be
on large-scale GPUs, future thousand-core chips, or across million-core warehouse scale …
on large-scale GPUs, future thousand-core chips, or across million-core warehouse scale …