A survey of techniques for optimizing deep learning on GPUs

S Mittal, S Vaishay - Journal of Systems Architecture, 2019 - Elsevier
The rise of deep learning (DL) has been fuelled by improvements in accelerators. Due to
its unique features, the GPU remains the most widely used accelerator for DL …

A survey on convolutional neural network accelerators: GPU, FPGA and ASIC

Y Hu, Y Liu, Z Liu - 2022 14th International Conference on …, 2022 - ieeexplore.ieee.org
In recent years, artificial intelligence (AI) has undergone rapid development and been applied in
various areas. Among a vast number of neural network (NN) models, the convolutional …

Varuna: scalable, low-cost training of massive deep learning models

S Athlur, N Saran, M Sivathanu, R Ramjee… - Proceedings of the …, 2022 - dl.acm.org
Systems for training massive deep learning models (billions of parameters) today assume
and require specialized "hyperclusters": hundreds or thousands of GPUs wired with …

Estimating GPU memory consumption of deep learning models

Y Gao, Y Liu, H Zhang, Z Li, Y Zhu, H Lin… - Proceedings of the 28th …, 2020 - dl.acm.org
Deep learning (DL) has been increasingly adopted by a variety of software-intensive
systems. Developers mainly use GPUs to accelerate the training, testing, and deployment of …
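
The paper builds a static estimator over the framework's computation graph. As a much cruder illustration of what such an estimate must account for, the sketch below is my own back-of-the-envelope arithmetic for fp32 training with an Adam-style optimizer; it is not the paper's estimator, and the activation term is left as an input because it depends on batch size and architecture.

```python
def estimate_training_bytes(n_params: int,
                            bytes_per_elem: int = 4,
                            activation_bytes: int = 0) -> int:
    """Crude lower bound on GPU memory for one fp32 training step.

    Persistent state per parameter (assumed Adam-style optimizer):
      weights + gradients + two optimizer moments = 4 copies.
    Activations vary with batch size and architecture, so pass them in.
    """
    persistent = 4 * n_params
    return persistent * bytes_per_elem + activation_bytes

# A 350M-parameter model already needs ~5.2 GiB before any activations.
print(estimate_training_bytes(350_000_000) / 2**30, "GiB")
```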

Checkmate: Breaking the memory wall with optimal tensor rematerialization

P Jain, A Jain, A Nrusimha, A Gholami… - Proceedings of …, 2020 - proceedings.mlsys.org
Modern neural networks are increasingly bottlenecked by the limited capacity of on-device
GPU memory. Prior work explores dropping activations as a strategy to scale to larger neural …
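
Checkmate formulates the choice of which activations to drop and recompute as an optimization problem and solves for a schedule. The underlying mechanism, recomputing activations during the backward pass instead of storing them, is what PyTorch exposes as gradient checkpointing; the sketch below shows that generic mechanism with a fixed even split, not Checkmate's optimal schedule.

```python
import torch
from torch.utils.checkpoint import checkpoint_sequential

# A deep stack whose intermediate activations would normally all be kept.
layers = [torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.ReLU())
          for _ in range(32)]
model = torch.nn.Sequential(*layers)

x = torch.randn(64, 1024, requires_grad=True)

# Keep only segment-boundary activations; everything inside each of the
# 4 segments is recomputed on the fly during backward.
out = checkpoint_sequential(model, 4, x, use_reentrant=False)
out.sum().backward()
```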

SwapAdvisor: Pushing deep learning beyond the GPU memory limit via smart swapping

CC Huang, G Jin, J Li - Proceedings of the Twenty-Fifth International …, 2020 - dl.acm.org
It is known that deeper and wider neural networks can achieve better accuracy. But it is
difficult to continue this trend of increasing model size due to limited GPU memory. One …
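
SwapAdvisor plans what to swap and when jointly with operator scheduling and memory allocation. The primitive it orchestrates is simple: copy tensors between GPU memory and pinned host memory on a side CUDA stream so transfers overlap compute. A hand-rolled sketch of that primitive (my own, far from the paper's planner):

```python
import torch

assert torch.cuda.is_available()          # sketch requires a GPU
copy_stream = torch.cuda.Stream()         # side stream to overlap compute

def swap_out(t: torch.Tensor) -> torch.Tensor:
    """Start an async copy of a GPU tensor into pinned host memory."""
    host = torch.empty(t.shape, dtype=t.dtype, pin_memory=True)
    with torch.cuda.stream(copy_stream):
        host.copy_(t, non_blocking=True)
    t.record_stream(copy_stream)          # defer reuse of t's GPU block
    return host

def swap_in(host: torch.Tensor) -> torch.Tensor:
    """Start an async copy of a pinned host tensor back to the GPU."""
    with torch.cuda.stream(copy_stream):
        return host.to("cuda", non_blocking=True)

a = torch.randn(4096, 4096, device="cuda")
h = swap_out(a)
del a                                     # safe: record_stream guards it
# ... other GPU work runs here while the copy proceeds ...
b = swap_in(h)
torch.cuda.current_stream().wait_stream(copy_stream)  # sync before using b
```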

Capuchin: Tensor-based GPU memory management for deep learning

X Peng, X Shi, H Dai, H Jin, W Ma, Q Xiong… - Proceedings of the …, 2020 - dl.acm.org
In recent years, deep learning has gained unprecedented success in various domains; the
key to this success is the larger and deeper deep neural networks (DNNs) that achieved very …
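
Capuchin tracks tensor access patterns at runtime and decides, per tensor, whether freeing its memory via swapping or via recomputation is cheaper. A toy version of that cost comparison is below; it is my own simplification, ignoring the overlap with compute and the memory-pressure feedback that the real system models.

```python
from dataclasses import dataclass

@dataclass
class TensorInfo:
    size_bytes: int       # footprint of the evicted tensor
    recompute_ms: float   # time to regenerate it from its inputs

PCIE_GBPS = 12.0          # assumed effective host-GPU bandwidth

def swap_in_ms(t: TensorInfo) -> float:
    # swap-out is often hidden behind compute, so count the swap-in
    return t.size_bytes / (PCIE_GBPS * 1e9) * 1e3

def evict_plan(t: TensorInfo) -> str:
    """Pick the cheaper way to free this tensor's memory."""
    return "swap" if swap_in_ms(t) < t.recompute_ms else "recompute"

# A 256 MiB activation that takes 5 ms to recompute: swapping it back
# in costs ~22 ms, so recomputation wins.
print(evict_plan(TensorInfo(size_bytes=256 * 2**20, recompute_ms=5.0)))
```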

EXACT: Scalable graph neural networks training via extreme activation compression

Z Liu, K Zhou, F Yang, L Li, R Chen… - … Conference on Learning …, 2021 - openreview.net
Training Graph Neural Networks (GNNs) on large graphs is a fundamental challenge due to
the high memory usage, which is mainly occupied by activations (e.g., node embeddings) …
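
EXACT shrinks the activations stashed for the backward pass (the paper combines quantization with random projection). A minimal sketch of just the quantization half: a custom autograd function that saves an int8 copy instead of fp32, cutting that activation's storage roughly 4x at the cost of a lossy gradient. This is generic per-tensor quantization, far simpler than EXACT's scheme.

```python
import torch

class QuantizedReLU(torch.autograd.Function):
    """ReLU that stores its saved activation as int8 instead of fp32."""

    @staticmethod
    def forward(ctx, x):
        y = x.relu()
        scale = y.abs().max().clamp(min=1e-8) / 127.0
        ctx.scale = scale
        ctx.save_for_backward((y / scale).round().to(torch.int8))
        return y

    @staticmethod
    def backward(ctx, grad_out):
        (q,) = ctx.saved_tensors
        y = q.to(grad_out.dtype) * ctx.scale  # dequantize (lossy)
        return grad_out * (y > 0)             # ReLU mask from the lossy copy

x = torch.randn(128, 1024, requires_grad=True)
QuantizedReLU.apply(x).sum().backward()
print(x.grad.abs().mean())
```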

Melon: Breaking the memory wall for resource-efficient on-device machine learning

Q Wang, M Xu, C Jin, X Dong, J Yuan, X Jin… - Proceedings of the 20th …, 2022 - dl.acm.org
On-device learning is a promising technique for emerging privacy-preserving machine
learning paradigms. However, through quantitative experiments, we find that commodity …
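
Melon fits training into a mobile memory budget by combining, among other things, recomputation (see the Checkmate sketch above) with micro-batch decomposition. The latter, shown below, is plain gradient accumulation: activations live for only one micro-batch at a time while gradients sum to the full-batch result. This is the generic technique, not Melon's budget-aware planner.

```python
import torch

model = torch.nn.Linear(512, 10)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.CrossEntropyLoss()

batch_x = torch.randn(64, 512)
batch_y = torch.randint(0, 10, (64,))

MICRO = 8  # only 8 samples' activations are live at once, not 64
opt.zero_grad()
for xs, ys in zip(batch_x.split(MICRO), batch_y.split(MICRO)):
    # weight each micro-batch so the summed grads match the full batch
    loss = loss_fn(model(xs), ys) * (len(xs) / len(batch_x))
    loss.backward()  # grads accumulate in .grad across iterations
opt.step()
```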

LLMCad: Fast and scalable on-device large language model inference

D Xu, W Yin, X Jin, Y Zhang, S Wei, M Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative tasks, such as text generation and question answering, hold a crucial position in
the realm of mobile applications. Due to their sensitivity to privacy concerns, there is a …
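
The snippet cuts off at the motivation; the full paper's core idea is generate-then-verify, where a small memory-resident model drafts tokens and the larger model only confirms or corrects them. A toy greedy version of that loop is sketched below with stand-in `draft`/`target` callables (hypothetical names); real systems verify a whole draft in one batched pass, sample rather than act greedily, and reuse KV caches.

```python
from typing import Callable, List

Token = int
Model = Callable[[List[Token]], Token]  # returns greedy next token

def generate_then_verify(draft: Model, target: Model,
                         prompt: List[Token], k: int, n_new: int) -> List[Token]:
    """Draft k tokens cheaply, keep the prefix the target agrees with,
    then append one token from the target itself."""
    seq = list(prompt)
    while len(seq) < len(prompt) + n_new:
        drafted: List[Token] = []
        for _ in range(k):                       # cheap drafting phase
            drafted.append(draft(seq + drafted))
        for i, tok in enumerate(drafted):        # verification phase
            if target(seq + drafted[:i]) != tok:
                drafted = drafted[:i]            # reject from the mismatch on
                break
        seq += drafted
        seq.append(target(seq))                  # target's own next token
    return seq[:len(prompt) + n_new]

# Toy stand-ins: draft guesses "last token + 1", target sums the sequence.
print(generate_then_verify(lambda s: (s[-1] + 1) % 7,
                           lambda s: sum(s) % 7,
                           [1, 2], k=4, n_new=6))
```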