MapReduce scheduling algorithms in Hadoop: a systematic study

S Hedayati, N Maleki, T Olsson, F Ahlgren… - Journal of Cloud …, 2023 - Springer
Hadoop is a framework for storing and processing huge volumes of data on clusters. It uses
Hadoop Distributed File System (HDFS) for storing data and uses MapReduce to process …

Mercury: Hybrid centralized and distributed scheduling in large shared clusters

K Karanasos, S Rao, C Curino, C Douglas… - 2015 USENIX Annual …, 2015 - usenix.org
Datacenter-scale computing for analytics workloads is increasingly common. High
operational costs force heterogeneous applications to share cluster resources for achieving …

Resource-freeing attacks: improve your cloud performance (at your neighbor's expense)

V Varadarajan, T Kooburat, B Farley… - Proceedings of the …, 2012 - dl.acm.org
Cloud computing promises great efficiencies by multiplexing resources among disparate
customers. For example, Amazon's Elastic Compute Cloud (EC2), Microsoft Azure, Google's …

Hierarchical scheduling for diverse datacenter workloads

AA Bhattacharya, D Culler, E Friedman… - Proceedings of the 4th …, 2013 - dl.acm.org
There has been a recent industrial effort to develop multi-resource hierarchical schedulers.
However, the existing implementations have some shortcomings in that they might leave …

Traffic-aware geo-distributed big data analytics with predictable job completion time

P Li, S Guo, T Miyazaki, X Liao, H Jin… - … on Parallel and …, 2016 - ieeexplore.ieee.org
Big data analytics has attracted close attention from both industry and academia because of
its great benefits in cost reduction and better decision making. As the fast growth of various …

A survey of the state-of-the-art in fair multi-resource allocations for data centers

P Poullie, T Bocek, B Stiller - IEEE Transactions on Network …, 2017 - ieeexplore.ieee.org
Multi-resource allocation in data centers determines a network and service management
task of crucial importance. While traditionally computing systems are shared based on a …

HybridMR: A hierarchical MapReduce scheduler for hybrid data centers

B Sharma, T Wood, CR Das - 2013 IEEE 33rd International …, 2013 - ieeexplore.ieee.org
Virtualized environments are attractive because they simplify cluster management, while
facilitating cost-effective workload consolidation. As a result, virtual machines in public …

Improving Hadoop MapReduce performance on heterogeneous single board computer clusters

S Lim, D Park - Future Generation Computer Systems, 2024 - Elsevier
Over the past decade, Apache Hadoop has become a leading framework for big data
processing. Single board computer (SBC) clusters, predominantly adopting Raspberry Pi …

FRESH: Fair and efficient slot configuration and scheduling for Hadoop clusters

J Wang, Y Yao, Y Mao, B Sheng… - 2014 IEEE 7th …, 2014 - ieeexplore.ieee.org
Hadoop is an emerging framework for parallel big data processing. While becoming
popular, Hadoop is too complex for regular users to fully understand all the system …

Pythia: Faster big data in motion through predictive software-defined network optimization at runtime

MV Neves, CAF De Rose, K Katrinis… - 2014 IEEE 28th …, 2014 - ieeexplore.ieee.org
The rise of Internet of Things sensors, social networking and mobile devices has led to an
explosion of available data. Gaining insights into this data has led to the area of Big Data …