SkyServe: Serving AI Models across Regions and Clouds with Spot Instances

Z Mao, T Xia, Z Wu, WL Chiang, T Griggs… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent years have witnessed an explosive growth of AI models. The high cost of hosting AI
services on GPUs and their demanding service requirements make it timely and …

[BOOK] Sky Computing with Intercloud Brokers

Z Wu - 2024 - search.proquest.com
In an era where digital infrastructure increasingly relies on cloud computing, the need for
flexible workload migration across clouds has become crucial. This need is particularly …

SpotVerse: Optimizing Bioinformatics Workflows with Multi-Region Spot Instances in Galaxy and Beyond

M Son, GG Akbulut, MT Kandemir - Proceedings of the 25th International …, 2024 - dl.acm.org
As demand for cloud computing in bioinformatics increases, various studies have explored
options for running large-scale workloads with reduced costs, often leveraging spot …

Design and Implementation of a Scalable Financial Exchange in the Public Cloud

M Haseeb, J Geng, D Duclos-Cavalcanti… - arXiv preprint arXiv …, 2024 - arxiv.org
Financial exchanges are migrating to the cloud, but the best-effort nature of the public cloud
is at odds with the stringent latency requirements of exchanges. We present Jasper, a …

Serving Models, Fast and Slow: Optimizing Heterogeneous LLM Inferencing Workloads at Scale

S Jaiswal, K Jain, Y Simmhan, A Parayil… - arXiv preprint arXiv …, 2025 - arxiv.org
Large Language Model (LLM) inference workloads handled by global cloud providers can
include both latency-sensitive and insensitive tasks, creating a diverse range of Service …

Saving Money for Analytical Workloads in the Cloud

T Srivastava, RC Fernandez - arXiv preprint arXiv:2408.00253, 2024 - arxiv.org
As users migrate their analytical workloads to cloud databases, it is becoming just as
important to reduce monetary costs as it is to optimize query runtime. In the cloud, a query is …

[PDF] An Extensible Architecture for Distributed Heterogeneous Processing

F Luan - 2024 - eecs.berkeley.edu
The fundamental challenge in computer system design has always been reconciling the
growing demands of applications with the constraints of available hardware. As hardware …

Automated and Efficient Multi-Cloud Configuration for Machine Learning Workloads

M Lazuka - 2024 - research-collection.ethz.ch
The last decade of computer science has witnessed rapid development in the field of
machine learning (ML). The growing popularity of ML results in an increasing number of ML …