A survey of techniques for managing and leveraging caches in GPUs
S Mittal - Journal of Circuits, Systems, and Computers, 2014 - World Scientific
Initially introduced as special-purpose accelerators for graphics applications, graphics
processing units (GPUs) have now emerged as general purpose computing platforms for a …
Managing DRAM latency divergence in irregular GPGPU applications
Memory controllers in modern GPUs aggressively reorder requests for high bandwidth
usage, often interleaving requests from different warps. This leads to high variance in the …
Scheduling page table walks for irregular GPU applications
Recent studies on commercial hardware demonstrated that irregular GPU applications can
bottleneck on virtual-to-physical address translations. In this work, we explore ways to …
iGPU: exception support and speculative execution on GPUs
Since the introduction of fully programmable vertex shader hardware, GPU computing has
made tremendous advances. Exception support and speculative execution are the next …
Microarchitectural performance characterization of irregular GPU kernels
GPUs are increasingly being used to accelerate general-purpose applications, including
applications with data-dependent, irregular memory access patterns and control flow …
Top-down performance profiling on NVIDIA's GPUs
The rise of data-intensive algorithms, such as Machine Learning ones, has meant a strong
diversification of Graphics Processing Units (GPU) in fields with intensive Data-Level …
Architecting the last-level cache for GPUs using STT-RAM technology
Future GPUs should have larger L2 caches based on the current trends in VLSI technology
and GPU architectures toward increase of processing core count. Larger L2 caches …
Porting CMP benchmarks to GPUs
GPUs have become increasingly popular in recent years, in large part due to their potential
to offer a large amount of computational power at low prices. They offer massive potential …
GPGPU workload characteristics and performance analysis
GPUs are much more power-efficient devices compared to CPUs, but due to several
performance bottlenecks, the performance per watt of GPUs is often much lower than what …
Study on dual-channel revenue sharing coordination mechanisms based on the free riding
W Ganfu, AI **ng-zheng… - 2009 6th International …, 2009 - ieeexplore.ieee.org
With the rapid development of e-commerce and the adoption of dual channels, free riding
becomes more prevalent than ever before and often results in channel conflict. In this paper …