Google Scholar

Deep learning workload scheduling in gpu datacenters: A survey

Z Ye, W Gao, Q Hu, P Sun, X Wang, Y Luo… - ACM Computing …, 2024 - dl.acm.org

Deep learning (DL) has demonstrated its remarkable success in a wide variety of fields. The
development of a DL model is a time-consuming and resource-intensive procedure. Hence …

Cited by 23

A survey on scheduling techniques in computing and network convergence

S Tang, Y Yu, H Wang, G Wang, W Chen… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org

The computing demand for massive applications has led to the ubiquitous deployment of
computing power. This trend results in the urgent need for higher-level computing resource …

Cited by 11

[PDF] usenix.org

MLaaS in the wild: Workload analysis and scheduling in Large-Scale heterogeneous GPU clusters

Q Weng, W Xiao, Y Yu, W Wang, C Wang, J He… - … USENIX Symposium on …, 2022 - usenix.org

With the sustained technological advances in machine learning (ML) and the availability of
massive datasets recently, tech companies are deploying large ML-as-a-Service (MLaaS) …

Cited by 300

[PDF] usenix.org

Fairness in serving large language models

Y Sheng, S Cao, D Li, B Zhu, Z Li, D Zhuo… - … USENIX Symposium on …, 2024 - usenix.org

High-demand LLM inference services (e.g., ChatGPT and BARD) support a wide range of
requests from short chat conversations to long document reading. To ensure that all client …

Cited by 41

[PDF] usenix.org

Scaling distributed machine learning with In-Network aggregation

A Sapio, M Canini, CY Ho, J Nelson, P Kalnis… - … USENIX Symposium on …, 2021 - usenix.org

Training machine learning models in parallel is an increasingly important workload. We
accelerate distributed parallel training by designing a communication primitive that uses a …

Cited by 494

[PDF] usenix.org

Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

C Lin, Z Han, C Zhang, Y Yang, F Yang… - … USENIX Symposium on …, 2024 - usenix.org

The rise of large language models (LLMs) has enabled LLM-based applications (a.k.a. AI
agents or co-pilots), a new software paradigm that combines the strength of LLM and …

Cited by 16

[PDF] arxiv.org

Characterization and prediction of deep learning workloads in large-scale gpu datacenters

Q Hu, P Sun, S Yan, Y Wen, T Zhang - Proceedings of the International …, 2021 - dl.acm.org

Modern GPU datacenters are critical for delivering Deep Learning (DL) models and services
in both the research community and industry. When operating a datacenter, optimization of …

Cited by 146

[PDF] usenix.org

MAST: Global scheduling of ML training across Geo-Distributed datacenters at hyperscale

A Choudhury, Y Wang, T Pelkonen… - … USENIX Symposium on …, 2024 - usenix.org

In public clouds, users must manually select a datacenter region to upload their ML training
data and launch ML training workloads in the same region to ensure data and computation …

Cited by 13

[PDF] usenix.org

Beware of fragmentation: Scheduling GPU-Sharing workloads with fragmentation gradient descent

Q Weng, L Yang, Y Yu, W Wang, X Tang… - 2023 USENIX Annual …, 2023 - usenix.org

Large tech companies are piling up a massive number of GPUs in their server fleets to run
diverse machine learning (ML) workloads. However, these expensive devices often suffer …

Cited by 45

[PDF] usenix.org

Pollux: Co-adaptive cluster scheduling for goodput-optimized deep learning

A Qiao, SK Choe, SJ Subramanya… - … on Operating Systems …, 2021 - usenix.org

Pollux improves scheduling performance in deep learning (DL) clusters by adaptively
co-optimizing inter-dependent factors both at the per-job level and at the cluster-wide level …

Cited by 205

Heterogeneity-Aware cluster scheduling policies for deep learning workloads

Deep learning workload scheduling in gpu datacenters: A survey