Scalable system scheduling for HPC and big data

A Reuther, C Byun, W Arcand, D Bestor… - Journal of Parallel and …, 2018 - Elsevier
In the rapidly expanding field of parallel processing, job schedulers are the “operating
systems” of modern big data architectures and supercomputing systems. Job schedulers …

Towards HPC and big data analytics convergence: Design and experimental evaluation of a HPDA framework for eScience at scale

D Elia, S Fiore, G Aloisio - IEEE Access, 2021 - ieeexplore.ieee.org
Over the last two decades, scientific discovery has increasingly been driven by the large
availability of data from a multitude of sources, including high-resolution simulations …

Scheduler technologies in support of high performance data analysis

A Reuther, C Byun, W Arcand, D Bestor… - 2016 IEEE High …, 2016 - ieeexplore.ieee.org
Job schedulers are a key component of scalable computing infrastructures. They orchestrate
all of the work executed on the computing infrastructure and directly impact the effectiveness …

Teaching and learning HPC through serious games

J Mullen, L Milechin, D Milechin - Journal of Parallel and Distributed …, 2021 - Elsevier
Serious games provide pathways for learners to develop intuition about concepts that are
new to them. Such games are especially valuable in an educational context because they …

Node-based job scheduling for large scale simulations of short running jobs

C Byun, W Arcand, D Bestor, B Bergeron… - 2021 IEEE High …, 2021 - ieeexplore.ieee.org
Diverse workloads such as interactive supercomputing, big data analysis, and large-scale AI
algorithm development, require a high-performance scheduler. This paper presents a novel …

DBOS: A proposal for a data-centric operating system

M Cafarella, D DeWitt, V Gadepally, J Kepner… - arXiv preprint arXiv …, 2020 - arxiv.org
Current operating systems are complex systems that were designed before today's
computing environments. This makes it difficult for them to meet the scalability …

Hyperscaling internet graph analysis with D4M on the MIT SuperCloud

V Gadepally, J Kepner, L Milechin… - 2018 IEEE High …, 2018 - ieeexplore.ieee.org
Detecting anomalous behavior in network traffic is a major challenge due to the volume and
velocity of network traffic. For example, a 10 Gigabit Ethernet connection can generate over …

Benchmarking data analysis and machine learning applications on the Intel KNL many-core processor

C Byun, J Kepner, W Arcand, D Bestor… - 2017 IEEE High …, 2017 - ieeexplore.ieee.org
Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product
family. KNL has generated significant interest in the data analysis and machine learning …

Processing of crowdsourced observations of aircraft in a high performance computing environment

A Weinert, N Underhill, B Gill… - 2020 IEEE High …, 2020 - ieeexplore.ieee.org
As unmanned aircraft systems (UASs) continue to integrate into the US National Airspace
System (NAS), there is a need to quantify the risk of airborne collisions between unmanned …

Benchmarking the processing of aircraft tracks with triples mode and self-scheduling

A Weinert, M Brittain, N Underhill… - 2021 IEEE High …, 2021 - ieeexplore.ieee.org
As unmanned aircraft systems (UASs) continue to integrate into the US National Airspace
System (NAS), there is a need to quantify the risk of airborne collisions between unmanned …