Multi-objective scheduling of many tasks in cloud platforms
The scheduling of a many-task workflow in a distributed computing platform is a well-known
NP-hard problem. The problem is even more complex and challenging when the virtualized …
ZHT: A light-weight reliable persistent dynamic scalable zero-hop distributed hash table
This paper presents ZHT, a zero-hop distributed hash table, which has been tuned for the
requirements of high-end computing systems. ZHT aims to be a building block for future …
Optimizing load balancing and data-locality with data-aware scheduling
Load balancing techniques (e.g., work stealing) are important for obtaining the best performance
for distributed task scheduling systems that have multiple schedulers making scheduling …
Bar: An efficient data locality driven task scheduling algorithm for cloud computing
Large-scale data processing has become increasingly common in recent years in cloud computing
systems such as MapReduce, Hadoop, and Dryad. In these systems, files are split into many …
Fusionfs: Toward supporting data-intensive scientific applications on extreme-scale high-performance computing systems
The state-of-the-art, yet decades-old, architecture of high-performance computing systems has
its compute and storage resources separated. It is thus limited for modern data-intensive …
Enabling scalable scientific workflow management in the Cloud
Cloud computing is gaining tremendous momentum in both academia and industry. In this
context, we define the term “Cloud Workflow” as the specification, execution and provenance …
Data replication in data intensive scientific applications with performance guarantee
Data replication has been widely adopted in data-intensive scientific applications to reduce
data file transfer time and bandwidth consumption. However, the problem of data replication …
Resource allocation and scheduling in cloud computing: Policy and algorithm
Cloud computing is a new distributed commercial computing model that aims at providing
computational resources or services to users over a network in a low-cost manner. Resource …
Making a case for distributed file systems at exascale
Exascale computers will enable the unraveling of significant scientific mysteries. Predictions
are that 2019 will be the year of exascale, with millions of compute nodes and billions of …
Load‐balanced and locality‐aware scheduling for data‐intensive workloads at extreme scales
Data‐driven programming models such as many‐task computing (MTC) have been
prevalent for running data‐intensive scientific applications. MTC applies over …