Energy and power aware job scheduling and resource management: Global survey—initial analysis

M Maiterth, G Koenig, K Pedretti, S Jana… - 2018 IEEE …, 2018 - ieeexplore.ieee.org
This work describes the motivation and methodology of a first-of-its-kind global survey of
HPC centers actively employing Energy and Power Aware Scheduling and Resource …

Towards energy budget control in HPC

PF Dutot, Y Georgiou, D Glesser… - 2017 17th IEEE/ACM …, 2017 - ieeexplore.ieee.org
Energy consumption has become one of the most critical issues in the evolution of High
Performance Computing systems (HPC). Controlling the energy consumption of …

Power aware high performance computing: Challenges and opportunities for application and system developers—Survey & tutorial

M Maiterth, T Wilde, D Lowenthal… - … Conference on High …, 2017 - ieeexplore.ieee.org
Power and energy consumption are seen as one of the most critical design factors for any next
generation large-scale HPC system. The price centers have to pay for energy is shifting the …

Global experiences with HPC operational data measurement, collection and analysis

M Ott, W Shin, N Bourassa, T Wilde… - 2020 IEEE …, 2020 - ieeexplore.ieee.org
As we move into the exascale era, supercomputers grow larger, denser, more
heterogeneous, and ever more complex. Operating such machines reliably and efficiently …

Comparing GPU power and frequency capping: A case study with the MuMMI workflow

T Patki, Z Frye, H Bhatia, F Di Natale… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org
Accomplishing the goal of exascale computing under a potential power limit requires HPC
clusters to maximize both parallel efficiency and power efficiency. As modern HPC systems …

ECP software technology capability assessment report

MA Heroux, LC McInnes, R Thakur, JS Vetter, XS Li… - 2020 - osti.gov
The Exascale Computing Project (ECP) Software Technology (ST) Focus Area is
responsible for developing critical software capabilities that will enable successful execution …

A novel approach for job scheduling optimizations under power cap for ARM and Intel HPC systems

D Rajagopal, D Tafani, Y Georgiou… - 2017 IEEE 24th …, 2017 - ieeexplore.ieee.org
The ever-increasing energy demands of modern High Performance Computing (HPC)
platforms is undeniably one of the most critical aspects for the future design and evolution of …

A unified platform for exploring power management strategies

D Ellsworth, T Patki, M Schulz… - … Workshop on Energy …, 2016 - ieeexplore.ieee.org
Power is quickly becoming a first class resource management concern in HPC. Upcoming
HPC systems will likely be hardware over-provisioned, which will require enhanced power …

DRLCap: Runtime GPU Frequency Capping with Deep Reinforcement Learning

Y Wang, M Hao, H He, W Zhang, Q Tang… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Power and energy consumption is the limiting factor of modern computing systems. As the
GPU becomes a mainstream computing device, power management for GPUs becomes …

Monitoring large scale supercomputers: A case study with the Lassen supercomputer

T Patki, A Bertsch, I Karlin, DH Ahn… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
Scalable management of user workloads on large-scale supercomputers remains a
challenge due to the tradeoff between capturing adequate detail for analysis from various …