Scalable deep learning on distributed infrastructures: Challenges, techniques, and tools
Deep Learning (DL) has had an immense success in the recent past, leading to state-of-the-
art results in various domains, such as image recognition and natural language processing …
Offloading machine learning to programmable data planes: A systematic survey
The demand for machine learning (ML) has increased significantly in recent decades,
enabling several applications, such as speech recognition, computer vision, and …
Chasing carbon: The elusive environmental footprint of computing
Given recent algorithm, software, and hardware innovation, computing has enabled a
plethora of new applications. As computing becomes increasingly ubiquitous, however, so …
INFaaS: Automated model-less inference serving
Despite existing work in machine learning inference serving, ease-of-use and cost efficiency
remain challenges at large scales. Developers must manually search through thousands of …
MArk: Exploiting cloud services for Cost-Effective, SLO-Aware machine learning inference serving
The advances of Machine Learning (ML) have sparked a growing demand of ML-as-a-
Service: developers train ML models and publish them in the cloud as online services to …
BATCH: Machine learning inference serving on serverless platforms with adaptive batching
Serverless computing is a new pay-per-use cloud service paradigm that automates resource
scaling for stateless functions and can potentially facilitate bursty machine learning serving …
Atoll: A scalable low-latency serverless platform
With user-facing apps adopting serverless computing, good latency performance of
serverless platforms has become a strong fundamental requirement. However, it is difficult to …
INFless: a native serverless system for low-latency, high-throughput inference
Modern websites increasingly rely on machine learning (ML) to improve their business
efficiency. Developing and maintaining ML services incurs high costs for developers …
Kraken: Adaptive container provisioning for deploying dynamic DAGs in serverless platforms
The growing popularity of microservices has led to the proliferation of online cloud service-
based applications, which are typically modelled as Directed Acyclic Graphs (DAGs) …
Cocktail: A multidimensional optimization for model serving in cloud
With a growing demand for adopting ML models for a variety of application services, it is vital
that the frameworks serving these models are capable of delivering highly accurate …