A survey on automatic parameter tuning for big data processing systems

H Herodotou, Y Chen, J Lu - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Big data processing systems (e.g., Hadoop, Spark, Storm) contain a vast number of
configuration parameters controlling parallelism, I/O behavior, memory settings, and …

Resource management and scheduling in distributed stream processing systems: a taxonomy, review, and future directions

X Liu, R Buyya - ACM Computing Surveys (CSUR), 2020 - dl.acm.org
Stream processing is an emerging paradigm to handle data streams upon arrival, powering
latency-critical applications such as fraud detection, algorithmic trading, and health …

Dhalion: self-regulating stream processing in Heron

A Floratou, A Agrawal, B Graham, S Rao… - Proceedings of the …, 2017 - dl.acm.org
In recent years, there has been an explosion of large-scale real-time analytics needs and a
plethora of streaming systems have been developed to support such applications. These …

KungFu: Making training in distributed machine learning adaptive

L Mai, G Li, M Wagenländer, K Fertakis… - … USENIX Symposium on …, 2020 - usenix.org
When using distributed machine learning (ML) systems to train models on a cluster of worker
machines, users must configure a large number of parameters: hyper-parameters (e.g., the …

MIRAS: Model-based reinforcement learning for microservice resource allocation over scientific workflows

Z Yang, P Nguyen, H Jin… - 2019 IEEE 39th …, 2019 - ieeexplore.ieee.org
Microservice, an architectural design that decomposes applications into loosely coupled
services, is adopted in modern software design, including cloud-based scientific workflow …

A review on big data real-time stream processing and its scheduling techniques

N Tantalaki, S Souravlas… - International Journal of …, 2020 - Taylor & Francis
Over the last decade, several interconnected disruptions have happened in the large-scale
distributed and parallel computing landscape. The volume of data currently produced by …

Towards scalable edge-native applications

J Wang, Z Feng, S George, R Iyengar, P Pillai… - Proceedings of the 4th …, 2019 - dl.acm.org
Latency-sensitive edge-native applications may be the key to commercial success of edge
infrastructure. However, success in the form of widespread deployment of such applications …

Optimal operator replication and placement for distributed stream processing systems

V Cardellini, V Grassi, F Lo Presti… - ACM SIGMETRICS …, 2017 - dl.acm.org
Exploiting on-the-fly computation, Data Stream Processing (DSP) applications are widely
used to process unbounded streams of data and extract valuable information in a near real …

Pipelined dynamic scheduling of big data streams

S Souravlas, S Anastasiadou - Applied Sciences, 2020 - mdpi.com
We are currently living in the big data era, in which it has become more necessary than ever
to develop “smart” schedulers. It is common knowledge that the default Storm scheduler, as …

DRS: Auto-scaling for real-time stream analytics

TZJ Fu, J Ding, RTB Ma, M Winslett… - IEEE/ACM …, 2017 - ieeexplore.ieee.org
In a stream data analytics system, input data arrive continuously and trigger the processing
and updating of analytics results. We focus on applications with real-time constraints, in …