Full stack optimization of transformer inference: a survey
Recent advances in state-of-the-art DNN architecture design have been moving toward
Transformer models. These models achieve superior accuracy across a wide range of …
FORGE: Pre-training open foundation models for science
Large language models (LLMs) are poised to revolutionize the way we conduct scientific
research. However, both model complexity and pre-training cost are impeding effective …
Mobile Foundation Model as Firmware
In today's landscape, smartphones have evolved into hubs for hosting a multitude of deep
learning models aimed at local execution. A key realization driving this work is the notable …
Not all GPUs are created equal: characterizing variability in large-scale, accelerator-rich systems
Scientists are increasingly exploring and utilizing the massive parallelism of general-purpose accelerators such as GPUs for scientific breakthroughs. As a result, datacenters …
Mind the gap: Attainable data movement and operational intensity bounds for tensor algorithms
The architectural design-space exploration (or DSE) process, whether manual or automated, benefits greatly from knowing the limits of the metrics of interest in advance. Data movement …
Generative AI beyond LLMs: System implications of multi-modal generation
As the development of large-scale Generative AI models evolves beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial and temporal …
Efficient Tensor Offloading for Large Deep-Learning Model Training based on Compute Express Link
Deep learning (DL) models are becoming bigger, easily exceeding the memory capacity of a single accelerator. Recent progress in large DL training utilizes CPU memory as an …
OptimStore: In-storage optimization of large-scale DNNs with on-die processing
Training deep neural network (DNN) models is a resource-intensive, iterative process. For this reason, complex optimizers like Adam are now widely adopted, as they increase the …
A Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models
Transformers have revolutionized deep learning and generative modeling, enabling
unprecedented advancements in natural language processing tasks. However, the size of …
AMPeD: An analytical model for performance in distributed training of transformers
Transformers are a class of machine learning models that have recently attracted considerable interest for a multitude of reasons. They can process multiple modalities efficiently and have …