Efficient acceleration of deep learning inference on resource-constrained edge devices: A review

MMH Shuvo, SK Islam, J Cheng… - Proceedings of the …, 2022 - ieeexplore.ieee.org
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …

Scaling speech technology to 1,000+ languages

V Pratap, A Tjandra, B Shi, P Tomasello, A Babu… - Journal of Machine …, 2024 - jmlr.org
Expanding the language coverage of speech technology has the potential to improve
access to information for many more people. However, current speech technology is …

XLS-R: Self-supervised cross-lingual speech representation learning at scale

A Babu, C Wang, A Tjandra, K Lakhotia, Q Xu… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper presents XLS-R, a large-scale model for cross-lingual speech representation
learning based on wav2vec 2.0. We train models with up to 2B parameters on nearly half a …

Memorization without overfitting: Analyzing the training dynamics of large language models

K Tirumala, A Markosyan… - Advances in …, 2022 - proceedings.neurips.cc
Despite their wide adoption, the underlying training and memorization dynamics of very
large language models are not well understood. We empirically study exact memorization in …

Scaling laws for generative mixed-modal language models

A Aghajanyan, L Yu, A Conneau… - International …, 2023 - proceedings.mlr.press
Generative language models define distributions over sequences of tokens that can
represent essentially any combination of data modalities (e.g., any permutation of image …

Scaling up models and data with t5x and seqio

A Roberts, HW Chung, G Mishra, A Levskaya… - Journal of Machine …, 2023 - jmlr.org
Scaling up training datasets and model parameters has benefited neural network-based
language models, but also presents challenges like distributed compute, input data …

CM3: A causal masked multimodal model of the internet

A Aghajanyan, B Huang, C Ross, V Karpukhin… - arXiv preprint arXiv …, 2022 - arxiv.org
We introduce CM3, a family of causally masked generative models trained over a large
corpus of structured multi-modal documents that can contain both text and image tokens …

Colossal-AI: A unified deep learning system for large-scale parallel training

S Li, H Liu, Z Bian, J Fang, H Huang, Y Liu… - Proceedings of the …, 2023 - dl.acm.org
The success of Transformer models has pushed the deep learning model scale to billions of
parameters, but the memory limitation of a single GPU has led to an urgent need for training …

Model compression and efficient inference for large language models: A survey

W Wang, W Chen, Y Luo, Y Long, Z Lin… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformer-based large language models have achieved tremendous success. However,
the significant memory and computational costs incurred during the inference process make …

Galvatron: Efficient transformer training over multiple gpus using automatic parallelism

X Miao, Y Wang, Y Jiang, C Shi, X Nie, H Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org
Transformer models have achieved state-of-the-art performance across various application
domains and have gradually become the foundation of advanced large deep learning …