Deep learning workload scheduling in gpu datacenters: A survey

Z Ye, W Gao, Q Hu, P Sun, X Wang, Y Luo… - ACM Computing …, 2024 - dl.acm.org
Deep learning (DL) has demonstrated its remarkable success in a wide variety of fields. The
development of a DL model is a time-consuming and resource-intensive procedure. Hence …

Communication-efficient large-scale distributed deep learning: A comprehensive survey

F Liang, Z Zhang, H Lu, V Leung, Y Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid growth in the volume of data sets, models, and devices in the domain of deep
learning, there is increasing attention on large-scale distributed deep learning. In contrast to …

Clover: Toward sustainable ai with carbon-aware machine learning inference service

B Li, S Samsi, V Gadepally, D Tiwari - Proceedings of the International …, 2023 - dl.acm.org
This paper presents a solution to the challenge of mitigating carbon emissions from hosting
large-scale machine learning (ML) inference services. ML inference is critical to modern …

Graft: Efficient inference serving for hybrid deep learning with SLO guarantees via DNN re-alignment

J Wu, L Wang, Q **, F Liu - IEEE Transactions on Parallel and …, 2023 - ieeexplore.ieee.org
Deep neural networks (DNNs) have been widely adopted for various mobile inference tasks,
yet their ever-increasing computational demands are hindering their deployment on …

Resource allocation and workload scheduling for large-scale distributed deep learning: A survey

F Liang, Z Zhang, H Lu, C Li, V Leung, Y Guo… - arXiv preprint arXiv …, 2024 - arxiv.org
With rapidly increasing distributed deep learning workloads in large-scale data centers,
efficient distributed deep learning framework strategies for resource allocation and workload …

INSS: An intelligent scheduling orchestrator for multi-GPU inference with spatio-temporal sharing

Z Han, R Zhou, C Xu, Y Zeng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
As the applications of AI proliferate, it is critical to increase the throughput of online DNN
inference services. Multi-process service (MPS) improves the utilization rate of GPU …

HarmonyBatch: Batching multi-SLO DNN inference with heterogeneous serverless functions

J Chen, F Xu, Y Gu, L Chen, F Liu… - 2024 IEEE/ACM 32nd …, 2024 - ieeexplore.ieee.org
Deep Neural Network (DNN) inference on serverless functions is gaining prominence due to
its potential for substantial budget savings. Existing works on serverless DNN inference …

A stochastic approach for scheduling AI training jobs in GPU-based systems

F Filippini, J Anselmi, D Ardagna… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In this work, we optimize the scheduling of Deep Learning (DL) training jobs from the
perspective of a Cloud Service Provider running a data center, which efficiently selects …

Reducing datacenter compute carbon footprint by harnessing the power of specialization: Principles, metrics, challenges and opportunities

T Eilam, P Bose, LP Carloni, A Cidon… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
Computing is an indispensable tool in addressing climate change, but it also contributes to a
significant and steadily increasing carbon footprint, partly due to the exponential growth in …

Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs

A Chen, F Xu, L Han, Y Dong, L Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
GPUs have become the de facto hardware devices for accelerating Deep Neural Network
(DNN) inference workloads. However, the conventional sequential execution mode of DNN …