Learning scheduling algorithms for data processing clusters

H Mao, M Schwarzkopf, SB Venkatakrishnan… - Proceedings of the …, 2019 - dl.acm.org
Efficiently scheduling data processing jobs on distributed compute clusters requires complex
algorithms. Current systems use simple, generalized heuristics and ignore workload …

Open problems in queueing theory inspired by datacenter computing

M Harchol-Balter - Queueing Systems, 2021 - Springer
Datacenter operations today provide a plethora of new queueing and scheduling problems.
The notion of a “job” has become more general and multi-dimensional. The ways in which …

Online scheduling via gradient descent for weighted flow time minimization

Q Chen, S Im, A Petety - Proceedings of the 2025 Annual ACM-SIAM …, 2025 - SIAM
In this paper, we explore how a natural generalization of Shortest Remaining Processing
Time (SRPT) can be a powerful meta-algorithm for online scheduling. The meta-algorithm …

Optimizing job offloading schedule for collaborative DNN inference

Y Duan, J Wu - IEEE Transactions on Mobile Computing, 2023 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have been widely deployed in mobile applications. DNN
inference latency is a critical metric to measure the service quality of those applications …

Adaptive scheduling of multiprogrammed dynamic-multithreading applications

Z Wang, C Xu, K Agrawal, J Li - Journal of Parallel and Distributed …, 2022 - Elsevier
Modern parallel platforms, such as clouds or servers, are often shared among many different
jobs. However, existing parallel programming runtime systems are designed and optimized …

Energy-efficient scheduling and routing via randomized rounding

E Bampis, A Kononov, D Letsios, G Lucarelli… - Journal of …, 2018 - Springer
We propose a unifying framework based on configuration linear programs and randomized
rounding, for different energy optimization problems in the dynamic speed-scaling setting …

Optimal resource allocation for elastic and inelastic jobs

B Berg, M Harchol-Balter, B Moseley, W Wang… - Proceedings of the …, 2020 - dl.acm.org
Modern data centers are tasked with processing heterogeneous workloads consisting of
various classes of jobs. These classes differ in their arrival rates, size distributions, and job …

Cloud computing value chains: Research from the operations management perspective

S Chen, K Moinzadeh, JS Song… - … & Service Operations …, 2023 - pubsonline.informs.org
Problem definition: Cloud computing is recognized as a critical driver of information
technology–enabled innovations. The operations management (OM) community, however …

DAG-aware harmonizing job scheduling and data caching for disaggregated analytics frameworks

Y Tong, J Liu, H Wang, M He, K Zhou, R He… - Future Generation …, 2024 - Elsevier
Modern data analytics frameworks often integrate with external storage services, which can
lead to storage bottlenecks. Existing caching and prefetching solutions utilize high-level …

Hierarchy-based algorithms for minimizing makespan under precedence and communication constraints

J Kulkarni, S Li, J Tarnawski, M Ye - … of the Fourteenth Annual ACM-SIAM …, 2020 - SIAM
We consider the classic problem of scheduling jobs with precedence constraints on a set of
identical machines to minimize the makespan objective function. Understanding the exact …