Griffin: Hardware-software support for efficient page migration in multi-GPU systems

T Baruah, Y Sun, AT Dinçer… - … Symposium on High …, 2020 - ieeexplore.ieee.org
As transistor scaling becomes increasingly difficult to achieve, scaling the core count
on a single GPU chip has also become extremely challenging. As the volume of data to …

Trans-FW: Short circuiting page table walk in multi-GPU systems via remote forwarding

B Li, J Yin, A Holey, Y Zhang, J Yang… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Multi-GPU systems have become a popular platform to meet the ever-growing application
demands. However, employing multiple GPUs does not guarantee proportional performance …

Exploiting adaptive data compression to improve performance and energy-efficiency of compute workloads in multi-GPU systems

MK Tavana, Y Sun, NB Agostini… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
Graphics Processing Unit (GPU) performance has relied heavily on our ability to scale the
number of transistors on chip, in order to satisfy the ever-increasing demands for more …

A benchmarking framework for interactive 3D applications in the cloud

T Liu, S He, S Huang, D Tsang, L Tang… - 2020 53rd Annual …, 2020 - ieeexplore.ieee.org
With the growing popularity of cloud gaming and cloud virtual reality (VR), interactive 3D
applications have become a major class of workloads for the cloud. However, despite their …

The Parallelization and Optimization of K-means Algorithm Based on MGPUSim

Z Mo, Y Wang, Q Zhang, G Zhang, M Guo… - … Conference on Artificial …, 2022 - Springer
Although the k-means algorithm has been parallelized into different platforms, it has not yet
been explored on multi-GPU architecture thoroughly. This paper presents a study of …

An accurate model to predict the performance of graphical processors using data mining and regression theory

M Shafiabadi, H Pedram, M Reshadi, A Reza - Computers & Electrical …, 2021 - Elsevier
Nowadays the use of graphical processors in fast and accurate scientific calculations has
increased. The heterogeneous design space that is conducted by the processors could …

Halcone: A hardware-level timestamp-based cache coherence scheme for multi-GPU systems

SA Mojumder, Y Sun, L Delshadtehrani, Y Ma… - arXiv preprint arXiv …, 2020 - arxiv.org
While multi-GPU (MGPU) systems are extremely popular for compute-intensive workloads,
several inefficiencies in the memory hierarchy and data movement result in a waste of GPU …

Techniques for optimizing dynamic parallelism on graphics processing units

I El Hajj - 2018 - ideals.illinois.edu
Dynamic parallelism is a feature of general purpose graphics processing units (GPUs)
whereby threads running on a GPU can spawn other threads without CPU intervention. This …

Improving the Virtual Memory Efficiency of GPUs

T Baruah - 2021 - search.proquest.com
GPUs have been adopted widely based on their ability to exploit data-level parallelism found in
modern-day applications, ranging from high performance computing to machine learning …

Exploring High Performance Deep Neural Networks on GPUs

S Dong - 2020 - search.proquest.com
Over the past few decades, Machine Learning (ML) has gained unprecedented popularity,
becoming a pervasive technology that has benefitted a broad range of domains such as …