Multi-objective scheduling of many tasks in cloud platforms

F Zhang, J Cao, K Li, SU Khan, K Hwang - Future Generation Computer …, 2014 - Elsevier
The scheduling of a many-task workflow in a distributed computing platform is a well-known
NP-hard problem. The problem is even more complex and challenging when the virtualized …

ZHT: A light-weight reliable persistent dynamic scalable zero-hop distributed hash table

T Li, X Zhou, K Brandstatter, D Zhao… - 2013 IEEE 27th …, 2013 - ieeexplore.ieee.org
This paper presents ZHT, a zero-hop distributed hash table, which has been tuned for the
requirements of high-end computing systems. ZHT aims to be a building block for future …

Optimizing load balancing and data-locality with data-aware scheduling

K Wang, X Zhou, T Li, D Zhao, M Lang… - … Conference on Big …, 2014 - ieeexplore.ieee.org
Load balancing techniques (e.g., work stealing) are important to obtain the best performance
for distributed task scheduling systems that have multiple schedulers making scheduling …

BAR: An efficient data locality driven task scheduling algorithm for cloud computing

J Jin, J Luo, A Song, F Dong… - 2011 11th IEEE/ACM …, 2011 - ieeexplore.ieee.org
Large scale data processing is increasingly common in cloud computing systems like
MapReduce, Hadoop, and Dryad in recent years. In these systems, files are split into many …

FusionFS: Toward supporting data-intensive scientific applications on extreme-scale high-performance computing systems

D Zhao, Z Zhang, X Zhou, T Li, K Wang… - … conference on big …, 2014 - ieeexplore.ieee.org
State-of-the-art, yet decades-old, architecture of high-performance computing systems has
its compute and storage resources separated. It thus is limited for modern data-intensive …

Enabling scalable scientific workflow management in the Cloud

Y Zhao, Y Li, I Raicu, S Lu, W Tian, H Liu - Future Generation Computer …, 2015 - Elsevier
Cloud computing is gaining tremendous momentum in both academia and industry. In this
context, we define the term “Cloud Workflow” as the specification, execution and provenance …

Data replication in data intensive scientific applications with performance guarantee

D Nukarapu, B Tang, L Wang… - IEEE Transactions on …, 2010 - ieeexplore.ieee.org
Data replication has been well adopted in data intensive scientific applications to reduce
data file transfer time and bandwidth consumption. However, the problem of data replication …

Resource allocation and scheduling in cloud computing: Policy and algorithm

T Ma, Y Chu, L Zhao, O Ankhbayar - IETE Technical review, 2014 - Taylor & Francis
Cloud computing is a new distributed commercial computing model that aims at providing
computational resources or services to users over a network in a low-cost manner. Resource …

Making a case for distributed file systems at exascale

I Raicu, IT Foster, P Beckman - … of the third international workshop on …, 2011 - dl.acm.org
Exascale computers will enable the unraveling of significant scientific mysteries. Predictions
are that 2019 will be the year of exascale, with millions of compute nodes and billions of …

Load‐balanced and locality‐aware scheduling for data‐intensive workloads at extreme scales

K Wang, K Qiao, I Sadooghi, X Zhou… - Concurrency and …, 2016 - Wiley Online Library
Data‐driven programming models such as many‐task computing (MTC) have been
prevalent for running data‐intensive scientific applications. MTC applies over …