Combining machine learning techniques and genetic algorithm for predicting run times of high performance computing jobs

S Ramachandran, ML Jayalal, M Vasudevan… - Applied Soft …, 2024 - Elsevier
This study proposes a novel approach combining Machine Learning (ML) techniques and
Genetic Algorithms (GA) for predicting High-Performance Computing (HPC) job run times …

RLSchert: An HPC job scheduler using deep reinforcement learning and remaining time prediction

Q Wang, H Zhang, C Qu, Y Shen, X Liu, J Li - Applied Sciences, 2021 - mdpi.com
The job scheduler plays a vital role in high-performance computing platforms. It determines
the execution order of the jobs and the allocation of resources, which in turn affect the …

The University of Chicago

K Zhang - United States, 2024 - knowledge.uchicago.edu
First and foremost, I extend my deepest gratitude to my advisor, Prof. Dacheng Xiu, who has
guided me since my master's studies. Dacheng has not only been an exceptional academic …

An optimized learning-based directory placement policy with two-rounds selection in distributed file systems

Y Wang, F Yang, K Zhou, C Li, C Liu, J Zhang… - Future Generation …, 2024 - Elsevier
Load balancing is a critical problem in distributed file systems. Previous works focus on
achieving data distribution across nodes at the file-level, often overlooking the potential …

Backfilling HPC jobs with a multimodal-aware predictor

K Lamar, A Goponenko, C Peterson… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
Job scheduling aims to minimize the turnaround time on the submitted jobs while catering to
the resource constraints of High Performance Computing (HPC) systems. The challenge …

DRL-based and BSLD-aware job scheduling for Apache Spark cluster in hybrid cloud computing environments

W Shi, H Li, H Zeng - Journal of Grid Computing, 2022 - Springer
Spark is one of the most important big data computing engines, favored by academia and
industry for its low latency and ease of use. The explosive growth in data volumes is causing …

Prediction of Heterogeneous Device Task Runtime Based on Edge Server-Oriented Deep Neuro-Fuzzy System

H Wu, W Lin, W Shen, X Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Predicting the runtime of tasks is of great significance as it can help users better understand
the future runtime consumption of the tasks and make decisions for their heterogeneous …

Toward a Dynamic Allocation Strategy for Deadline‐Oriented Resource and Job Management in HPC Systems

B Linnert, CAF De Rose… - … and Computation: Practice …, 2025 - Wiley Online Library
As high‐performance computing (HPC) becomes a tool used in many different workflows,
quality of service (QoS) becomes increasingly important. In many cases, this includes the …

Light-weight prediction for improving energy consumption in HPC platforms

D Carastan-Santos, G Da Costa, M Poquet… - … Conference on Parallel …, 2024 - Springer
With the increase of demand for computing resources and the struggle to provide the
necessary energy, power-aware resource management is becoming a major issue for the …

LACS: Learning-Augmented Algorithms for Carbon-Aware Resource Scaling with Uncertain Demand

R Bostandoost, A Lechowicz, WA Hanafy… - Proceedings of the 15th …, 2024 - dl.acm.org
Motivated by an imperative to reduce the carbon emissions of cloud data centers, this paper
studies the online carbon-aware resource scaling problem with unknown job lengths …