Workload characterization: A survey revisited

MC Calzarossa, L Massari, D Tessera - ACM Computing Surveys (CSUR …, 2016 - dl.acm.org
Workload characterization is a well-established discipline that plays a key role in many
performance engineering studies. The large-scale social behavior inherent in the …

Large-scale cluster management at Google with Borg

A Verma, L Pedrosa, M Korupolu… - Proceedings of the …, 2015 - dl.acm.org
Google's Borg system is a cluster manager that runs hundreds of thousands of jobs, from
many thousands of different applications, across a number of clusters each with up to tens of …

Heterogeneity and dynamicity of clouds at scale: Google trace analysis

C Reiss, A Tumanov, GR Ganger, RH Katz… - Proceedings of the third …, 2012 - dl.acm.org
To better understand the challenges in developing effective cloud-based resource
schedulers, we analyze the first publicly available trace data from a sizable multi-purpose …

Omega: flexible, scalable schedulers for large compute clusters

M Schwarzkopf, A Konwinski, M Abd-El-Malek… - Proceedings of the 8th …, 2013 - dl.acm.org
Increasing scale and the need for rapid response to changing requirements are hard to meet
with current monolithic cluster scheduler architectures. This restricts the rate at which new …

Sparrow: distributed, low latency scheduling

K Ousterhout, P Wendell, M Zaharia… - Proceedings of the twenty …, 2013 - dl.acm.org
Large-scale data analytics frameworks are shifting towards shorter task durations and larger
degrees of parallelism to provide low latency. Scheduling highly parallel jobs that complete …

Imbalance in the cloud: An analysis on Alibaba cluster trace

C Lu, K Ye, G Xu, CZ Xu, T Bai - 2017 IEEE International …, 2017 - ieeexplore.ieee.org
To improve resource efficiency and design intelligent scheduler for clouds, it is necessary to
understand the workload characteristics and machine utilization in large-scale cloud data …

TetriSched: global rescheduling with adaptive plan-ahead in dynamic heterogeneous clusters

A Tumanov, T Zhu, JW Park, MA Kozuch… - Proceedings of the …, 2016 - dl.acm.org
TetriSched is a scheduler that works in tandem with a calendaring reservation system to
continuously re-evaluate the immediate-term scheduling plan for all pending jobs (including …

Where do developers log? An empirical study on logging practices in industry

Q Fu, J Zhu, W Hu, JG Lou, R Ding, Q Lin… - … Proceedings of the …, 2014 - dl.acm.org
System logs are widely used in various tasks of software system management. It is crucial to
avoid logging too little or too much. To achieve so, developers need to make informed …

A cost-efficient container orchestration strategy in Kubernetes-based cloud computing infrastructures with heterogeneous resources

Z Zhong, R Buyya - ACM Transactions on Internet Technology (TOIT), 2020 - dl.acm.org
Containers, as a lightweight application virtualization technology, have recently gained
immense popularity in mainstream cluster management systems like Google Borg and …

Dynamic resource allocation for spot markets in cloud computing environments

Q Zhang, Q Zhu, R Boutaba - 2011 Fourth IEEE International …, 2011 - ieeexplore.ieee.org
The advent of cloud computing promises to provide computational resources to customers
like public utilities such as water and electricity. To deal with dynamically fluctuating …