Autopilot: workload autoscaling at Google

K Rzadca, P Findeisen, J Swiderski, P Zych… - Proceedings of the …, 2020 - dl.acm.org
In many public and private Cloud systems, users need to specify a limit for the amount of
resources (CPU cores and RAM) to provision for their workloads. A job that exceeds its limits …

Predictive performance modeling for distributed batch processing using black box monitoring and machine learning

C Witt, M Bux, W Gusew, U Leser - Information Systems, 2019 - Elsevier
In many domains, the previous decade was characterized by increasing data volumes and
growing complexity of data analyses, creating new demands for batch processing on …

Pocket: Elastic ephemeral storage for serverless analytics

A Klimovic, Y Wang, P Stuedi, A Trivedi… - … USENIX Symposium on …, 2018 - usenix.org
Serverless computing is becoming increasingly popular, enabling users to quickly launch
thousands of short-lived tasks in the cloud with high elasticity and fine-grain billing. These …

Performance and cost-efficient Spark job scheduling based on deep reinforcement learning in cloud computing environments

MT Islam, S Karunasekera… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
Big data frameworks such as Spark and Hadoop are widely adopted to run analytics jobs in
both research and industry. Cloud offers affordable compute resources which are easier to …

Llama: A heterogeneous & serverless framework for auto-tuning video analytics pipelines

F Romero, M Zhao, NJ Yadwadkar… - Proceedings of the ACM …, 2021 - dl.acm.org
The proliferation of camera-enabled devices and large video repositories has led to a
diverse set of video analytics applications. These applications rely on video pipelines …

AlloX: Compute allocation in hybrid clusters

TN Le, X Sun, M Chowdhury, Z Liu - Proceedings of the fifteenth …, 2020 - dl.acm.org
Modern deep learning frameworks support a variety of hardware, including CPU, GPU, and
other accelerators, to perform computation. In this paper, we study how to schedule jobs …

Taming performance variability

A Maricq, D Duplyakin, I Jimenez, C Maltzahn… - … USENIX Symposium on …, 2018 - usenix.org
The performance of compute hardware varies: software run repeatedly on the same server
(or a different server with supposedly identical parts) can produce performance results that …

Finding Faster Configurations Using FLASH

V Nair, Z Yu, T Menzies, N Siegmund… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Finding good configurations of a software system is often challenging since the number of
configuration options can be large. Software engineers often make poor choices about …

SmartHarvest: Harvesting idle CPUs safely and efficiently in the cloud

Y Wang, K Arya, M Kogias, M Vanga… - Proceedings of the …, 2021 - dl.acm.org
We can increase the efficiency of public cloud datacenters by harvesting allocated but
temporarily idling CPU cores from customer virtual machines (VMs) to run batch or analytics …

Morphling: Fast, near-optimal auto-configuration for cloud-native model serving

L Wang, L Yang, Y Yu, W Wang, B Li, X Sun… - Proceedings of the …, 2021 - dl.acm.org
Machine learning models are widely deployed in production cloud to provide online
inference services. Efficiently deploying inference services requires careful tuning of …