Google Scholar

The future of computing beyond Moore's Law

J Shalf - Philosophical Transactions of the Royal Society …, 2020 - royalsocietypublishing.org

Moore's Law is a techno-economic model that has enabled the information technology
industry to double the performance and functionality of digital electronics roughly every 2 …

Cited by 660 - All 14 versions

[PDF] acm.org

A full-stack search technique for domain optimized deep learning accelerators

D Zhang, S Huda, E Songhori, K Prabhu, Q Le… - Proceedings of the 27th …, 2022 - dl.acm.org

The rapidly-changing deep learning landscape presents a unique opportunity for building
inference accelerators optimized for specific datacenter-scale workloads. We propose Full …

Cited by 70 - All 3 versions

[PDF] acm.org

Towards general purpose acceleration by exploiting common data-dependence forms

V Dadu, J Weng, S Liu, T Nowatzki - … of the 52nd Annual IEEE/ACM …, 2019 - dl.acm.org

With slowing technology scaling, specialized accelerators are increasingly attractive
solutions to continue expected generational scaling of performance. However, in order to …

Cited by 113 - All 9 versions

[PDF] acm.org

TileFlow: A framework for modeling fusion dataflow via tree-based analysis

S Zheng, S Chen, S Gao, L Jia, G Sun… - Proceedings of the 56th …, 2023 - dl.acm.org

With the increasing size of DNN models and the growing discrepancy between compute
performance and memory bandwidth, fusing multiple layers together to reduce off-chip …

Cited by 16 - All 5 versions

[PDF] arxiv.org

Evaluating emerging AI/ML accelerators: IPU, RDU, and NVIDIA/AMD GPUs

H Peng, C Ding, T Geng, S Choudhury… - Companion of the 15th …, 2024 - dl.acm.org

The relentless advancement of artificial intelligence (AI) and machine learning (ML)
applications necessitates the development of specialized hardware accelerators capable of …

Cited by 9 - All 6 versions

[PDF] ieee.org

FCNNLib: A flexible convolution algorithm library for deep learning on FPGAs

Y Liang, Q …

[…] applications to dataflow-based coarse-grained reconfigurable array

AXM Chang, P Khopkar, B Romanous… - arXiv preprint arXiv …, 2022 - arxiv.org

The Streaming Engine (SE) is a Coarse-Grained Reconfigurable Array which provides
programming flexibility and high-performance with energy efficiency. An application program …

Cited by 8 - All 3 versions

[PDF] acm.org

EA4RCA: Efficient AIE accelerator design framework for regular Communication-Avoiding Algorithm

W Zhang, Y Liu, T Zang, Z Bao - ACM Transactions on Architecture and …, 2024 - dl.acm.org

With the introduction of the Adaptive Intelligence Engine (AIE), the Versal Adaptive Compute
Acceleration Platform (Versal ACAP) has garnered great attention. However, the current …

Cited by 1 - All 3 versions

[PDF] acm.org

Squaring the circle: Executing Sparse Matrix Computations on FlexTPU---A TPU-Like Processor

X He, KY Chen, S Feng, HS Kim, D Blaauw… - Proceedings of the …, 2022 - dl.acm.org

Systolic arrays have been successful to accelerate dense linear algebra for deep neural
networks (DNNs), but cannot handle sparse computations efficiently. Though early attempts …

Cited by 5 - All 3 versions

DAP: A 507-GMACs/J 256-Core Domain Adaptive Processor for Wireless Communication and Linear Algebra Kernels in 12-nm FINFET

KY Chen, CS Yang, YH Sun, CW Tseng… - IEEE Journal of Solid …, 2024 - ieeexplore.ieee.org

We present domain adaptive processor (DAP), a programmable systolic-array processor
designed for wireless communication and linear algebra workloads. DAP uses a globally …

All 2 versions
