Architectural support for address translation on GPUs: Designing memory management units for CPU/GPUs with unified address spaces

B Pichai, L Hsu, A Bhattacharjee - ACM SIGARCH Computer Architecture …, 2014 - dl.acm.org
The proliferation of heterogeneous compute platforms, of which CPU/GPU is a prevalent
example, necessitates a manageable programming model to ensure widespread adoption …

Memory interference characterization between CPU cores and integrated GPUs in mixed-criticality platforms

R Cavicchioli, N Capodieci… - 2017 22nd IEEE …, 2017 - ieeexplore.ieee.org
Most of today's mixed criticality platforms feature Systems on Chip (SoC) where a multi-core
CPU complex (the host) competes with an integrated Graphic Processor Unit (iGPU, the …

Efficient GPU synchronization without scopes: Saying no to complex consistency models

MD Sinclair, J Alsop, SV Adve - … of the 48th International Symposium on …, 2015 - dl.acm.org
As GPUs have become increasingly general purpose, applications with more general
sharing patterns and fine-grained synchronization have started to emerge. Unfortunately …

A survey of architectural approaches for improving GPGPU performance, programmability and heterogeneity

M Khairy, AG Wassal, M Zahran - Journal of Parallel and Distributed …, 2019 - Elsevier
With the skyrocketing advances of process technology, the increased need to process huge
amount of data, and the pivotal need for power efficiency, the usage of Graphics Processing …

Exploring memory consistency for massively-threaded throughput-oriented processors

BA Hechtman, DJ Sorin - Proceedings of the 40th Annual International …, 2013 - dl.acm.org
We re-visit the issue of hardware consistency models in the new context of massively-
threaded throughput-oriented processors (MTTOPs). A prominent example of an MTTOP is a …

Only buffer when you need to: Reducing on-chip GPU traffic with reconfigurable local atomic buffers

P Dalmia, R Mahapatra… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
In recent years, due to their wide availability and ease of programming, GPUs have emerged
as the accelerator of choice for a wide variety of applications including graph analytics and …

Code generation for embedded heterogeneous architectures on Android

R Membarth, O Reiche, F Hannig… - … Design, Automation & …, 2014 - ieeexplore.ieee.org
The success of Android is based on its unified Java programming model that allows to write
platform-independent programs for a variety of different target platforms. However, this …

An efficient sequential consistency implementation with dynamic race detection for GPUs

A Tabbakh, M Annavaram - Journal of Parallel and Distributed Computing, 2024 - Elsevier
As GPUs are being used for general purpose computations, applications with different
memory access requirements have emerged. In spite of the growing demand, only few GPU …

Address translation for throughput-oriented accelerators

B Pichai, L Hsu, A Bhattacharjee - IEEE Micro, 2015 - ieeexplore.ieee.org
With processor vendors embracing hardware heterogeneity, providing low overhead
hardware and software abstractions to support easy-to-use programming models is a critical …

Fusion coherence: scalable cache coherence for heterogeneous kilo-core system

S Pei, MS Kim, JL Gaudiot, N Xiong - Advanced Computer Architecture …, 2014 - Springer
Future heterogeneous systems will integrate CPUs and GPUs on a single chip to achieve
high computing performance as well as high throughput. In general, it would discard the …