- Academic Search

Distwar: Fast differentiable rendering on raster-based rendering pipelines

S Durvasula, A Zhao, F Chen, R Liang… - arXiv preprint arXiv …, 2023 - arxiv.org

Differentiable rendering is a technique used in an important emerging class of visual
computing applications that involves representing a 3D scene as a model that is trained from …

Cited by 13 (3 versions)

[PDF] github.io

HMG: Extending cache coherence protocols across modern hierarchical multi-GPU systems

X Ren, D Lustig, E Bolotin, A Jaleel… - … Symposium on High …, 2020 - ieeexplore.ieee.org

Prior work on GPU cache coherence has shown that simple hardware-or software-based
protocols can be more than sufficient. However, in recent years, features such as multi-chip …

Cited by 44 (3 versions)

[PDF] google.com

GPS: A global publish-subscribe model for multi-GPU memory management

H Muthukrishnan, D Lustig, D Nellans… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

Suboptimal management of memory and bandwidth is one of the primary causes of low
performance on systems comprising multiple GPUs. Existing memory management solutions …

Cited by 19 (3 versions)

[PDF] harinimuthukrishnan.net

Finepack: Transparently improving the efficiency of fine-grained transfers in multi-GPU systems

H Muthukrishnan, D Lustig, O Villa… - … Symposium on High …, 2023 - ieeexplore.ieee.org

Recent studies have shown that using fine-grained peer-to-peer (P2P) stores to
communicate among devices in multi-GPU systems is a promising path to achieve strong …

Cited by 7 (3 versions)

REC: Enhancing fine-grained cache coherence protocol in multi-GPU systems

G Ko, J Lee, H Kal, H Lee, WW Ro - Journal of Systems Architecture, 2025 - Elsevier

With the increasing demands of modern workloads, multi-GPU systems have emerged as a
scalable solution, extending performance beyond the capabilities of single GPUs. However …

Cited by 1 (2 versions)

[PDF] github.io

A survey of architectural approaches for improving GPGPU performance, programmability and heterogeneity

M Khairy, AG Wassal, M Zahran - Journal of Parallel and Distributed …, 2019 - Elsevier

With the skyrocketing advances of process technology, the increased need to process huge
amount of data, and the pivotal need for power efficiency, the usage of Graphics Processing …

Cited by 34 (4 versions)

[PDF] ed.ac.uk

HeteroGen: Automatic synthesis of heterogeneous cache coherence protocols

N Oswald, V Nagarajan, DJ Sorin… - … Symposium on High …, 2022 - ieeexplore.ieee.org

We solve the two challenges architects face when designing heterogeneous processors with
cache coherent shared memory. First, we develop an automated tool, called HeteroGen, for …

Cited by 11 (11 versions)

Only buffer when you need to: Reducing on-chip GPU traffic with reconfigurable local atomic buffers

P Dalmia, R Mahapatra… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org

In recent years, due to their wide availability and ease of programming, GPUs have emerged
as the accelerator of choice for a wide variety of applications including graph analytics and …

Cited by 6 (2 versions)

[PDF] acm.org

Fast fine-grained global synchronization on GPUs

K Wang, D Fussell, C Lin - … of the Twenty-Fourth International Conference …, 2019 - dl.acm.org

This paper extends the reach of General Purpose GPU programming by presenting a
software architecture that supports efficient fine-grained synchronization over global …

Cited by 17 (4 versions)

[PDF] arxiv.org

Exploring memory persistency models for GPUs

Z Lin, M Alshboul, Y Solihin… - 2019 28th International …, 2019 - ieeexplore.ieee.org

Given its high integration density, high speed, byte addressability, and low standby power,
non-volatile or persistent memory is expected to supplement/replace DRAM as main …

Cited by 15 (9 versions)

Efficient sequential consistency in GPUs via relativistic cache coherence

Distwar: Fast differentiable rendering on raster-based rendering pipelines
