- Academic Search

Stash: Have your scratchpad and cache it too

R Komuravelli, MD Sinclair, J Alsop, M Huzaifa… - ACM SIGARCH …, 2015 - dl.acm.org

Heterogeneous systems employ specialization for energy efficiency. Since data movement
is expected to be a dominant consumer of energy, these systems employ specialized …

Cited by 104 · All 11 versions

[PDF] illinois.edu

Efficient GPU synchronization without scopes: Saying no to complex consistency models

MD Sinclair, J Alsop, SV Adve - … of the 48th International Symposium on …, 2015 - dl.acm.org

As GPUs have become increasingly general purpose, applications with more general
sharing patterns and fine-grained synchronization have started to emerge. Unfortunately …

Cited by 94 · All 12 versions

[PDF] johnalsop.net

Spandex: A flexible interface for efficient heterogeneous coherence

J Alsop, M Sinclair, S Adve - 2018 ACM/IEEE 45th Annual …, 2018 - ieeexplore.ieee.org

Recent heterogeneous architectures have trended toward tighter integration and shared
memory largely due to the efficient communication and programmability enabled by this …

Cited by 63 · All 7 versions

[PDF] utexas.edu

Selective GPU caches to eliminate CPU-GPU HW cache coherence

N Agarwal, D Nellans, E Ebrahimi… - … Symposium on High …, 2016 - ieeexplore.ieee.org

Cache coherence is ubiquitous in shared memory multiprocessors because it provides a
simple, high performance memory abstraction to programmers. Recent work suggests …

Cited by 73 · All 5 versions

[PDF] acm.org

Chasing away RAts: Semantics and evaluation for relaxed atomics on heterogeneous systems

MD Sinclair, J Alsop, SV Adve - Proceedings of the 44th Annual …, 2017 - dl.acm.org

An unambiguous and easy-to-understand memory consistency model is crucial for ensuring
correct synchronization and guiding future design of heterogeneous systems. In a widely …

Cited by 60 · All 19 versions

[PDF] uiuc.edu

Lazy release consistency for GPUs

J Alsop, MS Orr, BM Beckmann… - 2016 49th Annual IEEE …, 2016 - ieeexplore.ieee.org

The heterogeneous-race-free (HRF) memory model has been embraced by the
Heterogeneous System Architecture (HSA) Foundation and OpenCL™ because it clearly …

Cited by 61 · All 8 versions

[PDF] acm.org

Coherence domain restriction on large scale systems

Y Fu, TM Nguyen, D Wentzlaff - … of the 48th International Symposium on …, 2015 - dl.acm.org

Designing massive scale cache coherence systems has been an elusive goal. Whether it be
on large-scale GPUs, future thousand-core chips, or across million-core warehouse scale …

Cited by 61 · All 9 versions

[PDF] acm.org

Mozart: Taming taxes and composing accelerators with shared-memory

V Suresh, B Mishra, Y Jing, Z Zhu, N Jin… - Proceedings of the …, 2024 - dl.acm.org

Resource-constrained system-on-chips (SoCs) are increasingly heterogeneous with
specialized accelerators for various tasks. Acceleration taxes due to control and data …

Cited by 1 · All 6 versions

[PDF] um.es

Racer: TSO consistency via race detection

A Ros, S Kaxiras - 2016 49th Annual IEEE/ACM International …, 2016 - ieeexplore.ieee.org

Several recent efforts aim to simplify coherence and its associated costs (e.g., directory size,
complexity) in multicores. The bulk of these efforts rely on program data-race-free (DRF) …

Cited by 24 · All 7 versions

[PDF] academia.edu

Callback: Efficient synchronization without invalidation with a directory just for spin-waiting

A Ros, S Kaxiras - Proceedings of the 42nd Annual International …, 2015 - dl.acm.org

Cache coherence protocols based on self-invalidation allow a simpler design compared to
traditional invalidation-based protocols, by relying on data-race-free (DRF) semantics and …

Cited by 34 · All 17 versions

DeNovoSync: Efficient support for arbitrary synchronization without writer-initiated invalidations

Stash: Have your scratchpad and cache it too
