OpenTuner: An extensible framework for program autotuning

J Ansel, S Kamil, K Veeramachaneni… - Proceedings of the 23rd …, 2014 - dl.acm.org
Program autotuning has been shown to achieve better or more portable performance in a
number of domains. However, autotuners themselves are rarely portable between projects …

A survey on software methods to improve the energy efficiency of parallel computing

C Jin, BR de Supinski, D Abramson… - … Journal of High …, 2017 - journals.sagepub.com
Energy consumption is one of the top challenges for achieving the next generation of
supercomputing. Codesign of hardware and software is critical for improving energy …

Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models

RB Roy, T Patel, V Gadepally, D Tiwari - Proceedings of the 42nd ACM …, 2021 - dl.acm.org
As parallel applications become more complex, auto-tuning becomes more desirable,
challenging, and time-consuming. We propose Bliss, a novel solution for auto-tuning …

Taming parallel I/O complexity with auto-tuning

B Behzad, HVT Luu, J Huchette, S Byna… - Proceedings of the …, 2013 - dl.acm.org
We present an auto-tuning system for optimizing I/O performance of HDF5 applications and
demonstrate its value across platforms, applications, and at scale. The system uses a …

Optimizing I/O performance of HPC applications with autotuning

B Behzad, S Byna, Prabhat, M Snir - ACM Transactions on Parallel …, 2019 - dl.acm.org
Parallel input/output is an essential component of modern high-performance computing
(HPC). Obtaining good I/O performance for a broad range of applications on diverse HPC …

[BOOK][B] Parallel computing hits the power wall: principles, challenges, and a survey of solutions

AF Lorenzon, ACS Beck Filho - 2019 - books.google.com
This book describes several approaches to adaptability that are applied for the optimization
of parallel applications, such as thread-level parallelism exploitation and dynamic voltage …

Applying static analysis to large-scale, multi-threaded Java programs

C Artho, A Biere - Proceedings 2001 Australian Software …, 2001 - ieeexplore.ieee.org
Static analysis is a tremendous help when trying to find faults in complex software. Writing
multi-threaded programs is difficult, because the thread scheduling increases the program …

Multi objective optimization of HPC kernels for performance, power, and energy

P Balaprakash, A Tiwari, SM Wild - … 2013, Denver, CO, USA, November 18 …, 2014 - Springer
Code optimization in the high-performance computing realm has traditionally focused on
reducing execution time. The problem, in mathematical terms, has been expressed as a …

Massively parallel skyline computation for processing-in-memory architectures

V Zois, D Gupta, VJ Tsotras, WA Najjar… - Proceedings of the 27th …, 2018 - dl.acm.org
Processing-In-Memory (PIM) is an increasingly popular architecture aimed at addressing
the 'memory wall' crisis by prioritizing the integration of processors within DRAM. It promotes …

INSPIRE: The Insieme parallel intermediate representation

H Jordan, S Pellegrini, P Thoman… - Proceedings of the …, 2013 - ieeexplore.ieee.org
Programming standards like OpenMP, OpenCL and MPI are frequently considered
programming languages for developing parallel applications for their respective kind of …