When two choices are not enough: Balancing at scale in distributed stream processing

MAU Nasir, GDF Morales, N Kourtellis… - 2016 IEEE 32nd …, 2016 - ieeexplore.ieee.org
Carefully balancing load in distributed stream processing systems has a fundamental impact
on execution latency and throughput. Load balancing is challenging because real-world …

Orchestrating scheduling, grouping and parallelism to enhance the performance of distributed stream computing system

D Sun, H Chen, S Gao, R Buyya - Expert Systems with Applications, 2024 - Elsevier
In a big data stream computing environment, the arrival rate of data streams usually
fluctuates over time, posing a great challenge to the elasticity of system. The performance of …

Optimizing multiple multi-way stream joins

M Dossinger, S Michel - 2021 IEEE 37th International …, 2021 - ieeexplore.ieee.org
We address the joint optimization of multiple stream joins in a scale-out architecture by
tailoring prior work on multi-way stream joins to predicate-driven data partitioning schemes …

Load balancing for skewed streams on heterogeneous cluster

MAU Nasir, H Horii, M Serafini, N Kourtellis… - arXiv preprint arXiv …, 2017 - arxiv.org
Streaming applications frequently encounter skewed workloads and execute on
heterogeneous clusters. Optimal resource utilization in such adverse conditions becomes a …

Scaling out multi-way stream joins using optimized, iterative probing

M Dossinger, S Michel - … Conference on Big Data (Big Data), 2019 - ieeexplore.ieee.org
We propose MultiStream, a novel multi-way join operator that optimizes tuple-routing
schemes across materialized relations and intermediate results to compute the join results. It …

Aten: A dispatcher for big data applications in heterogeneous systems

PRR de Souza, KJ Matteussi… - … Conference on High …, 2018 - ieeexplore.ieee.org
Stream Processing Engines (SPEs) have to support high data ingestion to ensure the quality
and efficiency for the end-user or a system administrator. The data flow processed by SPE …

GDSW: a general framework for distributed sliding window over data streams

H Chen, Y Wang, Y Wang, X Ma - 2016 IEEE 22nd International …, 2016 - ieeexplore.ieee.org
The big data era is characterized by the emergence of live data with high volume and fast
arrival rate, it poses a new challenge to stream processing applications: how to process the …

Mining big and fast data: algorithms and optimizations for real-time data processing

MAU Nasir - 2018 - diva-portal.org
In the last decade, real-time data processing has attracted much attention from both
academic community and industry, as the meaning of big data has evolved to incorporate as …

System-aware dynamic partitioning for batch and streaming workloads

Z Zvara, PGN Szabó, BB Lóránt… - Proceedings of the 14th …, 2021 - dl.acm.org
When processing data streams with highly skewed and nonstationary key distributions, we
often observe overloaded partitions when the hash partitioning fails to balance data …

Mining Big and Fast Data: Algorithms for Large-Scale Data Processing

MAU Nasir - 2017 - kth.se
In the last decade, most of the research and industry has been focused on processing big
and fast data as massive amount of data has been produced on numerous handheld …