Addressing failures in exascale computing

M Snir, RW Wisniewski, JA Abraham… - … Journal of High …, 2014 - journals.sagepub.com
We present here a report produced by a workshop on 'Addressing failures in exascale
computing' held in Park City, Utah, 4–11 August 2012. The charter of this workshop was to …

HermitCore: a unikernel for extreme scale computing

S Lankes, S Pickartz, J Breitbart - … of the 6th International Workshop on …, 2016 - dl.acm.org
We expect that the size and the complexity of future supercomputers will increase on their
path to exascale systems and beyond. Therefore, system software has to adapt to the …

mOS: An architecture for extreme-scale operating systems

RW Wisniewski, T Inglett, P Keppel, R Murty… - Proceedings of the 4th …, 2014 - dl.acm.org
Linux®, or more specifically, the Linux API, plays a key role in HPC computing. Even for
extreme-scale computing, a known and familiar API is required for production machines …

On the scalability, performance isolation and device driver transparency of the IHK/McKernel hybrid lightweight kernel

B Gerofi, M Takagi, A Hori, G Nakamura… - 2016 IEEE …, 2016 - ieeexplore.ieee.org
Extreme degree of parallelism in high-end computing requires low operating system noise
so that large scale, bulk-synchronous parallel applications can be run efficiently. Noiseless …

Achieving performance isolation with lightweight co-kernels

J Ouyang, B Kocoloski, JR Lange… - Proceedings of the 24th …, 2015 - dl.acm.org
Performance isolation is emerging as a requirement for High Performance Computing (HPC)
applications, particularly as HPC architectures turn to in situ data processing and application …

A system software approach to proactive memory-error avoidance

CHA Costa, Y Park, BS Rosenburg… - SC'14: Proceedings …, 2014 - ieeexplore.ieee.org
Today's HPC systems use two mechanisms to address main-memory errors. Error-correcting
codes make correctable errors transparent to software, while checkpoint/restart (CR) …

HEXO: Offloading HPC compute-intensive workloads on low-cost, low-power embedded systems

P Olivier, AKMF Mehrab, S Lankes… - Proceedings of the 28th …, 2019 - dl.acm.org
OS-capable embedded systems exhibiting a very low power consumption are available at
an extremely low price point. It makes them highly compelling in a datacenter context. In this …

Interface for heterogeneous kernels: A framework to enable hybrid OS designs targeting high performance computing on manycore architectures

T Shimosawa, B Gerofi, M Takagi… - … Conference on High …, 2014 - ieeexplore.ieee.org
Turning towards exascale systems and beyond, it has been widely argued that the currently
available systems software is not going to be feasible due to various requirements such as …

HEXO: Offloading long-running compute- and memory-intensive workloads on low-cost, low-power embedded systems

P Olivier, AKMF Mehrab, S Errabelly… - … on Cloud Computing, 2024 - ieeexplore.ieee.org
OS-capable embedded systems exhibiting a very low power consumption are available at
an extremely low price point. It makes them highly compelling in a datacenter context. We …

Performance and scalability of lightweight multi-kernel based operating systems

B Gerofi, R Riesen, M Takagi, T Boku… - 2018 IEEE …, 2018 - ieeexplore.ieee.org
Multi-kernels leverage today's multi-core chips to run multiple operating system (OS)
kernels, typically a Light Weight Kernel (LWK) and a Linux kernel, simultaneously. The LWK …