MapReduce scheduling algorithms in Hadoop: a systematic study
Hadoop is a framework for storing and processing huge volumes of data on clusters. It uses
Hadoop Distributed File System (HDFS) for storing data and uses MapReduce to process …
Hadoop Distributed File System (HDFS) for storing data and uses MapReduce to process …
Mercury: Hybrid centralized and distributed scheduling in large shared clusters
Datacenter-scale computing for analytics workloads is increasingly common. High
operational costs force heterogeneous applications to share cluster resources for achieving …
operational costs force heterogeneous applications to share cluster resources for achieving …
Resource-freeing attacks: improve your cloud performance (at your neighbor's expense)
Cloud computing promises great efficiencies by multiplexing resources among disparate
customers. For example, Amazon's Elastic Compute Cloud (EC2), Microsoft Azure, Google's …
customers. For example, Amazon's Elastic Compute Cloud (EC2), Microsoft Azure, Google's …
Hierarchical scheduling for diverse datacenter workloads
There has been a recent industrial effort to develop multi-resource hierarchical schedulers.
However, the existing implementations have some shortcomings in that they might leave …
However, the existing implementations have some shortcomings in that they might leave …
Traffic-aware geo-distributed big data analytics with predictable job completion time
Big data analytics has attracted close attention from both industry and academic because of
its great benefits in cost reduction and better decision making. As the fast growth of various …
its great benefits in cost reduction and better decision making. As the fast growth of various …
A survey of the state-of-the-art in fair multi-resource allocations for data centers
Multi-resource allocation in data centers determines a network and service management
task of crucial importance. While, traditionally computing systems are shared based on a …
task of crucial importance. While, traditionally computing systems are shared based on a …
Hybridmr: A hierarchical mapreduce scheduler for hybrid data centers
Virtualized environments are attractive because they simplify cluster management, while
facilitating cost-effective workload consolidation. As a result, virtual machines in public …
facilitating cost-effective workload consolidation. As a result, virtual machines in public …
Improving Hadoop MapReduce performance on heterogeneous single board computer clusters
S Lim, D Park - Future Generation Computer Systems, 2024 - Elsevier
Over the past decade, Apache Hadoop has become a leading framework for big data
processing. Single board computer (SBC) clusters, predominantly adopting Raspberry Pi …
processing. Single board computer (SBC) clusters, predominantly adopting Raspberry Pi …
Fresh: Fair and efficient slot configuration and scheduling for hadoop clusters
Hadoop is an emerging framework for parallel big data processing. While becoming
popular, Hadoop is too complex for regular users to fully understand all the system …
popular, Hadoop is too complex for regular users to fully understand all the system …
Pythia: Faster big data in motion through predictive software-defined network optimization at runtime
The rise of Internet of Things sensors, social networking and mobile devices has led to an
explosion of available data. Gaining insights into this data has led to the area of Big Data …
explosion of available data. Gaining insights into this data has led to the area of Big Data …