A Survey of Design and Optimization for Systolic Array-based DNN Accelerators

R Xu, S Ma, Y Guo, D Li - ACM Computing Surveys, 2023 - dl.acm.org

In recent years, it has been witnessed that the systolic array is a successful architecture for
DNN hardware accelerators. However, the design of systolic arrays also encountered many …

Cited by 41 · All 2 versions

STONNE: Enabling cycle-level microarchitectural simulation for DNN inference accelerators

F Muñoz-Martínez, JL Abellán… - 2021 IEEE …, 2021 - ieeexplore.ieee.org

The design of specialized architectures for accelerating the inference procedure of Deep
Neural Networks (DNNs) is a booming area of research nowadays. While first-generation …

Cited by 70 · All 5 versions

MTIA: First generation silicon targeting Meta's recommendation systems

A Firoozshahian, J Coburn, R Levenstein… - Proceedings of the 50th …, 2023 - dl.acm.org

Meta has traditionally relied on using CPU-based servers for running inference workloads,
specifically Deep Learning Recommendation Models (DLRM), but the increasing compute …

Cited by 33 · All 4 versions

FLAT: An optimized dataflow for mitigating attention bottlenecks

SC Kao, S Subramanian, G Agrawal… - Proceedings of the 28th …, 2023 - dl.acm.org

Attention mechanisms, primarily designed to capture pairwise correlations between words,
have become the backbone of machine learning, expanding beyond natural language …

Cited by 54 · All 6 versions

Xel: A cloud-agnostic data platform for the design-driven building of high-availability data science services

JA Barron-Lugo, JL Gonzalez-Compean… - Future Generation …, 2023 - Elsevier

This paper presents Xel, a cloud-agnostic data platform for the design-driven building of
high-availability data science services as a support tool for data-driven decision-making. We …

Cited by 11 · All 2 versions

LIBRA: Enabling Workload-Aware Multi-Dimensional Network Topology Optimization for Distributed Training of Large AI Models

W Won, S Rashidi, S Srinivasan… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

As model sizes in machine learning continue to scale, distributed training is necessary to
accommodate model weights within each device and to reduce training time. However, this …

Cited by 3 · All 3 versions

STIFT: A spatio-temporal integrated folding tree for efficient reductions in flexible DNN accelerators

F Muñoz-Martínez, JL Abellán, ME Acacio… - ACM Journal on …, 2023 - dl.acm.org

Increasing deployment of Deep Neural Networks (DNNs) recently fueled interest in the
development of specific accelerator architectures capable of meeting their stringent …

Cited by 8 · All 4 versions

Bifrost: End-to-End Evaluation and Optimization of Reconfigurable DNN Accelerators

A Stjerngren, P Gibson, J Cano - 2022 IEEE International …, 2022 - ieeexplore.ieee.org

Reconfigurable accelerators for deep neural networks (DNNs) promise to improve
performance such as inference latency. STONNE is the first cycle-accurate simulator for …

Cited by 9 · All 7 versions

Neural-Network-Assisted Packet Accelerators for Internet of Things Network Systems

WP Nwadiugwu, W Ejaz, M Kaneko… - IEEE Internet of …, 2023 - ieeexplore.ieee.org

Major device nodes within the Internet of Things (IoT) system collect and store information
in bit forms of 0's and 1's regardless of its repetition. The nodes do not possess the capability …

Cited by 3

Multi-channel medium access control protocols for wireless networks within computing packages

B Ollé, P Talarn, A Cabellos-Aparicio… - … on Circuits and …, 2023 - ieeexplore.ieee.org

Wireless communications at the chip scale emerge as an interesting complement to
traditional wire-based approaches thanks to their low latency, inherent broadcast nature …

Cited by 3 · All 3 versions

Data orchestration in deep learning accelerators

A Survey of Design and Optimization for Systolic Array-based DNN Accelerators
