Cluster resource scheduling in cloud computing: literature review and research challenges
Scheduling plays a pivotal role in cloud computing systems. Designing an efficient
scheduler is a challenging task. The challenge comes from several aspects, including the …
Optimus: an efficient dynamic resource scheduler for deep learning clusters
Deep learning workloads are common in today's production clusters due to the proliferation
of deep learning driven AI services (e.g., speech recognition, machine translation). A deep …
Protean: VM allocation service at scale
We describe the design and implementation of Protean, the Microsoft Azure service
responsible for allocating Virtual Machines (VMs) to millions of servers around the globe. A …
Firmament: Fast, centralized cluster scheduling at scale
Centralized datacenter schedulers can make high-quality placement decisions when
scheduling tasks in a cluster. Today, however, high-quality placements come at the cost of …
Hermod: principled and practical scheduling for serverless functions
Serverless computing has seen rapid growth due to the ease-of-use and cost-efficiency it
provides. However, function scheduling, a critical component of serverless systems, has …
Machine learning for computer systems and networking: A survey
Machine learning (ML) has become the de-facto approach for various scientific domains
such as computer vision and natural language processing. Despite recent breakthroughs …
Fifer: Tackling resource underutilization in the serverless era
Datacenters are witnessing a rapid surge in the adoption of serverless functions for
microservices-based applications. A vast majority of these microservices typically span less …
On the diversity of cluster workloads and its impact on research results
Six years ago, Google released an invaluable set of scheduler logs which has already been
used in more than 450 publications. We find that the scarcity of other data sources, however …
Resource scheduling methods for cloud computing environment: The role of meta-heuristics and artificial intelligence
The growth and development of scientific applications have demanded the creation of
efficient resource management systems. Resource provisioning and scheduling are two …
RobinHood: Tail Latency Aware Caching - Dynamic Reallocation from Cache-Rich to Cache-Poor
Tail latency is of great importance in user-facing web services. However, maintaining low tail
latency is challenging, because a single request to a web application server results in …