Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv…, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv…, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

Scaling vision transformers

X Zhai, A Kolesnikov, N Houlsby… - Proceedings of the …, 2022 - openaccess.thecvf.com
Attention-based neural networks such as the Vision Transformer (ViT) have recently attained
state-of-the-art results on many computer vision benchmarks. Scale is a primary ingredient …

The effect of choosing optimizer algorithms to improve computer vision tasks: a comparative study

E Hassan, MY Shams, NA Hikal, S Elmougy - Multimedia Tools and …, 2023 - Springer
Optimization algorithms are used to improve model accuracy. The optimization process
undergoes multiple cycles until convergence. A variety of optimization strategies have been …

Communication-efficient adaptive federated learning

Y Wang, L Lin, J Chen - International conference on machine …, 2022 - proceedings.mlr.press
Federated learning is a machine learning training paradigm that enables clients to jointly
train models without sharing their own localized data. However, the implementation of …

CocktailSGD: Fine-tuning foundation models over 500Mbps networks

J Wang, Y Lu, B Yuan, B Chen… - International …, 2023 - proceedings.mlr.press
Distributed training of foundation models, especially large language models (LLMs), is
communication-intensive and thus has heavily relied on centralized data centers with fast …

Communication-efficient distributed deep learning: A comprehensive survey

Z Tang, S Shi, W Wang, B Li, X Chu - arXiv preprint arXiv:2003.06307, 2020 - arxiv.org
Distributed deep learning (DL) has become prevalent in recent years to reduce training time
by leveraging multiple computing devices (e.g., GPUs/TPUs) due to larger models and …

Compute-efficient deep learning: Algorithmic trends and opportunities

BR Bartoldson, B Kailkhura, D Blalock - Journal of Machine Learning …, 2023 - jmlr.org
Although deep learning has made great progress in recent years, the exploding economic
and environmental costs of training neural networks are becoming unsustainable. To …

ZeRO++: Extremely efficient collective communication for giant model training

G Wang, H Qin, SA Jacobs, C Holmes… - arXiv preprint arXiv…, 2023 - arxiv.org
Zero Redundancy Optimizer (ZeRO) has been used to train a wide range of large language
models on massive GPU clusters due to its ease of use, efficiency, and good scalability …