Benchmarking big data systems: A review

R Han, LK John, J Zhan - IEEE Transactions on Services …, 2017 - ieeexplore.ieee.org
With the fast development of big data systems in recent years, a variety of open-source
benchmarks have been built to evaluate and compare the workloads on these systems, and …

AuTO: Scaling deep reinforcement learning for datacenter-scale automatic traffic optimization

L Chen, J Lingys, K Chen, F Liu - Proceedings of the 2018 conference of …, 2018 - dl.acm.org
Traffic optimizations (TO, e.g., flow scheduling, load balancing) in datacenters are difficult
online decision-making problems. Previously, they are done with heuristics relying on …

Ernest: Efficient performance prediction for Large-Scale advanced analytics

S Venkataraman, Z Yang, M Franklin, B Recht… - … USENIX symposium on …, 2016 - usenix.org
Recent workload trends indicate rapid growth in the deployment of machine learning,
genomics and scientific workloads on cloud computing infrastructure. However, efficiently …

Curator: Self-Managing Storage for Enterprise Clusters

I Cano, S Aiyar, V Arora, M Bhattacharyya… - … USENIX Symposium on …, 2017 - usenix.org
Modern cluster storage systems perform a variety of background tasks to improve the
performance, availability, durability, and cost-efficiency of stored data. For example, cleaners …

Detecting anomaly in big data system logs using convolutional neural network

S Lu, X Wei, Y Li, L Wang - 2018 IEEE 16th Intl Conf on …, 2018 - ieeexplore.ieee.org
Nowadays, big data systems are being widely adopted by many domains for offering
effective data solutions, such as manufacturing, healthcare, education, and media. Big data …

Making sense of performance in data analytics frameworks

K Ousterhout, R Rasti, S Ratnasamy… - … USENIX Symposium on …, 2015 - usenix.org
There has been much research devoted to improving the performance of data analytics
frameworks, but comparatively little effort has been spent systematically identifying the …

Selecting the best VM across multiple public clouds: a data-driven performance modeling approach

NJ Yadwadkar, B Hariharan, JE Gonzalez… - Proceedings of the …, 2017 - dl.acm.org
Users of cloud services are presented with a bewildering choice of VM types and the choice
of VM can have significant implications on performance and cost. In this paper we address …

Machine learning for computer systems and networking: A survey

ME Kanakis, R Khalili, L Wang - ACM Computing Surveys, 2022 - dl.acm.org
Machine learning (ML) has become the de-facto approach for various scientific domains
such as computer vision and natural language processing. Despite recent breakthroughs …

Drizzle: Fast and adaptable stream processing at scale

S Venkataraman, A Panda, K Ousterhout… - Proceedings of the 26th …, 2017 - dl.acm.org
Large scale streaming systems aim to provide high throughput and low latency. They are
often used to run mission-critical applications, and must be available 24x7. Thus such …

Llama: A heterogeneous & serverless framework for auto-tuning video analytics pipelines

F Romero, M Zhao, NJ Yadwadkar… - Proceedings of the ACM …, 2021 - dl.acm.org
The proliferation of camera-enabled devices and large video repositories has led to a
diverse set of video analytics applications. These applications rely on video pipelines …