A survey of methods for analyzing and improving GPU energy efficiency

S Mittal, JS Vetter - ACM Computing Surveys (CSUR), 2014 - dl.acm.org
Recent years have witnessed phenomenal growth in the computational capabilities and
applications of GPUs. However, this trend has also led to a dramatic increase in their power …

Towards high performance paged memory for GPUs

T Zheng, D Nellans, A Zulfiqar… - … Symposium on High …, 2016 - ieeexplore.ieee.org
Despite industrial investment in both on-die GPUs and next generation interconnects, the
highest performing parallel accelerators shipping today continue to be discrete GPUs …

Page placement strategies for GPUs within heterogeneous memory systems

N Agarwal, D Nellans, M Stephenson… - Proceedings of the …, 2015 - dl.acm.org
Systems from smartphones to supercomputers are increasingly heterogeneous, being
composed of both CPUs and GPUs. To maximize cost and energy efficiency, these systems …

Unimem: Runtime data management on non-volatile memory-based heterogeneous main memory

K Wu, Y Huang, D Li - Proceedings of the International Conference for …, 2017 - dl.acm.org
Non-volatile memory (NVM) provides a scalable and power-efficient solution to replace
DRAM as main memory. However, because of relatively high latency and low bandwidth of …

HM-ANN: Efficient billion-point nearest neighbor search on heterogeneous memory

J Ren, M Zhang, D Li - Advances in Neural Information …, 2020 - proceedings.neurips.cc
The state-of-the-art approximate nearest neighbor search (ANNS) algorithms face a
fundamental tradeoff between query latency and accuracy, because of small main memory …

Opportunities for nonvolatile memory systems in extreme-scale high-performance computing

JS Vetter, S Mittal - Computing in Science & Engineering, 2015 - ieeexplore.ieee.org
For extreme-scale high-performance computing systems, system-wide power consumption
has been identified as one of the key constraints moving forward, where DRAM main …

Approximate storage for energy efficient spintronic memories

A Ranjan, S Venkataramani, X Fong, K Roy… - Proceedings of the …, 2015 - dl.acm.org
Spintronic memories are promising candidates for future on-chip storage due to their high
density, non-volatility and near-zero leakage. However, the energy consumed by read and …

Unlocking bandwidth for GPUs in CC-NUMA systems

N Agarwal, D Nellans, M O'Connor… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
Historically, GPU-based HPC applications have had a substantial memory bandwidth
advantage over CPU-based workloads due to using GDDR rather than DDR memory …

GPU-initiated on-demand high-throughput storage access in the BaM system architecture

Z Qureshi, VS Mailthody, I Gelado, S Min… - Proceedings of the 28th …, 2023 - dl.acm.org
Graphics Processing Units (GPUs) have traditionally relied on the host CPU to initiate
access to the data storage. This approach is well-suited for GPU applications with known …

Runtime data management on non-volatile memory-based heterogeneous memory for task-parallel programs

K Wu, J Ren, D Li - SC18: International Conference for High …, 2018 - ieeexplore.ieee.org
Non-volatile memory (NVM) provides a scalable solution to replace DRAM as main memory.
Because of relatively high latency and low bandwidth of NVM (comparing with DRAM), NVM …