- Academic Search

RS Kannan, L Subramanian, A Raju, J Ahn… - Proceedings of the …, 2019 - dl.acm.org

The microservice architecture has dramatically reduced user effort in adopting and
maintaining servers by providing a catalog of functions as services that can be used as …

Cited by 187

Horus: Interference-aware and prediction-based scheduling in deep learning systems

G Yeung, D Borowiec, R Yang, A Friday… - … on Parallel and …, 2021 - ieeexplore.ieee.org

To accelerate the training of Deep Learning (DL) models, clusters of machines equipped
with hardware accelerators such as GPUs are leveraged to reduce execution time. State-of …

Cited by 82

Managing GPU concurrency in heterogeneous architectures

O Kayiran, NC Nachiappan, A Jog… - 2014 47th annual …, 2014 - ieeexplore.ieee.org

Heterogeneous architectures consisting of general-purpose CPUs and throughput-
optimized GPUs are projected to be the dominant computing platforms for many classes of …

Cited by 177

Anatomy of GPU memory system for multi-application execution

A Jog, O Kayiran, T Kesten, A Pattnaik… - Proceedings of the …, 2015 - dl.acm.org

As GPUs make headway in the computing landscape spanning mobile platforms,
supercomputers, cloud and virtual desktop platforms, supporting concurrent execution of …

Cited by 119

Managing DRAM latency divergence in irregular GPGPU applications

N Chatterjee, M O'Connor, GH Loh… - SC'14: Proceedings …, 2014 - ieeexplore.ieee.org

Memory controllers in modern GPUs aggressively reorder requests for high bandwidth
usage, often interleaving requests from different warps. This leads to high variance in the …

Cited by 119

Scheduling page table walks for irregular GPU applications

S Shin, G Cox, M Oskin, GH Loh… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org

Recent studies on commercial hardware demonstrated that irregular GPU applications can
bottleneck on virtual-to-physical address translations. In this work, we explore ways to …

Cited by 71

Zorua: A holistic approach to resource virtualization in GPUs

N Vijaykumar, K Hsieh, G Pekhimenko… - 2016 49th Annual …, 2016 - ieeexplore.ieee.org

This paper introduces a new resource virtualization framework, Zorua, that decouples the
programmer-specified resource usage of a GPU application from the actual allocation in the …

Cited by 85

HSM: A hybrid slowdown model for multitasking GPUs

X Zhao, M Jahre, L Eeckhout - … of the twenty-fifth international conference …, 2020 - dl.acm.org

Graphics Processing Units (GPUs) are increasingly widely used in the cloud to accelerate
compute-heavy tasks. However, GPU-compute applications stress the GPU architecture in …

Cited by 43

memif: Towards Programming Heterogeneous Memory Asynchronously

FX Lin, X Liu - ACM SIGPLAN Notices, 2016 - dl.acm.org

To harness a heterogeneous memory hierarchy, it is advantageous to integrate application
knowledge in guiding frequent memory move, i.e., replicating or migrating virtual memory …

Cited by 78

Efficient and fair multi-programming in GPUs via effective bandwidth management

H Wang, F Luo, M Ibrahim, O Kayiran… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

Managing the thread-level parallelism (TLP) of GPGPU applications by limiting it to a certain
degree is known to be effective in improving the overall performance. However, we find that …

Cited by 56

Application-aware memory system for fair and efficient execution of concurrent gpgpu applications

Grandslam: Guaranteeing slas for jobs in microservices execution frameworks