DeNovo: Rethinking the memory hierarchy for disciplined parallelism

B Choi, R Komuravelli, H Sung… - 2011 International …, 2011 - ieeexplore.ieee.org
For parallelism to become tractable for mass programmers, shared-memory languages and
environments must evolve to enforce disciplined practices that ban "wild shared-memory …

Complexity-effective multicore coherence

A Ros, S Kaxiras - Proceedings of the 21st international conference on …, 2012 - dl.acm.org
Much of the complexity and overhead (directory, state bits, invalidations) of a typical
directory coherence implementation stems from the effort to make it "invisible" even to the …

System and method for simplifying cache coherence using multiple write policies

S Kaxiras, A Ros - US Patent 9,274,960, 2016 - Google Patents
Background: The present invention relates in general to the caching of data in
multiprocessor systems and, more particularly, to simplified cache coherence protocols for …

Selective GPU caches to eliminate CPU-GPU HW cache coherence

N Agarwal, D Nellans, E Ebrahimi… - … Symposium on High …, 2016 - ieeexplore.ieee.org
Cache coherence is ubiquitous in shared memory multiprocessors because it provides a
simple, high performance memory abstraction to programmers. Recent work suggests …

TSO-CC: Consistency directed cache coherence for TSO

M Elver, V Nagarajan - 2014 IEEE 20th International …, 2014 - ieeexplore.ieee.org
Traditional directory coherence protocols are designed for the strictest consistency model,
sequential consistency (SC). When they are used for chip multiprocessors (CMPs) that …

DeNovoND: Efficient hardware support for disciplined non-determinism

H Sung, R Komuravelli, SV Adve - ACM SIGPLAN Notices, 2013 - dl.acm.org
Recent work has shown that disciplined shared-memory programming models that provide
deterministic-by-default semantics can simplify both parallel software and hardware …

Hierarchical private/shared classification: The key to simple and efficient coherence for clustered cache hierarchies

A Ros, M Davari, S Kaxiras - 2015 IEEE 21st International …, 2015 - ieeexplore.ieee.org
Hierarchical clustered cache designs are becoming an appealing alternative for multicores.
Grouping cores and their caches in clusters reduces network congestion by localizing traffic …

Protozoa: Adaptive granularity cache coherence

H Zhao, A Shriraman, S Kumar… - ACM SIGARCH Computer …, 2013 - dl.acm.org
State-of-the-art multiprocessor cache hierarchies propagate the use of a fixed granularity in
the cache organization to the design of the coherence protocol. Unfortunately, the fixed …

Reconciling predictability and coherent caching

A Bansal, J Singh, Y Hao, JY Wen… - 2020 9th …, 2020 - ieeexplore.ieee.org
Real-time systems are required to respond to their physical environment within predictable
time. While multi-core platforms provide incredible computational power and throughput …

Efficiently supporting dynamic task parallelism on heterogeneous cache-coherent systems

M Wang, T Ta, L Cheng, C Batten - 2020 ACM/IEEE 47th …, 2020 - ieeexplore.ieee.org
Manycore processors, with tens to hundreds of tiny cores but no hardware-based cache
coherence, can offer tremendous peak throughput on highly parallel programs while being …