A Survey of Design and Optimization for Systolic Array-based DNN Accelerators

R Xu, S Ma, Y Guo, D Li - ACM Computing Surveys, 2023 - dl.acm.org
In recent years, the systolic array has proven to be a successful architecture for
DNN hardware accelerators. However, the design of systolic arrays also encounters many …
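
For intuition about what these accelerators compute, below is a minimal cycle-level sketch (in NumPy, with made-up names) of an output-stationary systolic array: each processing element (PE) owns one output accumulator, operands enter skewed at the array edges, and values hop one PE per cycle. It is an illustrative toy, not a design taken from the survey.

```python
import numpy as np

def systolic_matmul(A, B):
    """Cycle-level sketch of an output-stationary systolic array.

    PE (i, j) holds the accumulator for C[i, j]; rows of A stream in
    from the left (row i skewed by i cycles) and columns of B from the
    top (column j skewed by j cycles), so A[i, k] and B[k, j] meet in
    PE (i, j) at cycle i + j + k. Illustrative toy, not a real design.
    """
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    acc = np.zeros((M, N))                 # one accumulator per PE
    a_reg = np.zeros((M, N))               # operand latched in each PE
    b_reg = np.zeros((M, N))
    for t in range(M + N + K - 2):         # cycles until the array drains
        # operands hop one PE right/down per cycle (update in reverse)
        for i in range(M):
            for j in range(N - 1, 0, -1):
                a_reg[i, j] = a_reg[i, j - 1]
        for j in range(N):
            for i in range(M - 1, 0, -1):
                b_reg[i, j] = b_reg[i - 1, j]
        # inject skewed inputs at the left and top edges
        for i in range(M):
            k = t - i
            a_reg[i, 0] = A[i, k] if 0 <= k < K else 0.0
        for j in range(N):
            k = t - j
            b_reg[0, j] = B[k, j] if 0 <= k < K else 0.0
        acc += a_reg * b_reg               # every PE does one MAC per cycle
    return acc

A, B = np.random.randn(3, 5), np.random.randn(5, 4)
assert np.allclose(systolic_matmul(A, B), A @ B)
```

Each PE talks only to its immediate neighbors and performs one multiply-accumulate per cycle; that regularity is what makes the array cheap to lay out in silicon, and it is also the source of the design constraints the survey discusses.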

FACT: FFN-Attention Co-optimized Transformer Architecture with Eager Correlation Prediction

Y Qin, Y Wang, D Deng, Z Zhao, X Yang, L Liu… - Proceedings of the 50th …, 2023 - dl.acm.org
The Transformer model is becoming prevalent in various AI applications thanks to its outstanding
performance. However, its high computation cost and memory footprint make its …

HighLight: Efficient and Flexible DNN Acceleration with Hierarchical Structured Sparsity

YN Wu, PA Tsai, S Muralidharan, A Parashar… - Proceedings of the 56th …, 2023 - dl.acm.org
Due to complex interactions among various deep neural network (DNN) optimization
techniques, modern DNNs can have weights and activations that are dense or sparse with …

Reconfigurability, Why It Matters in AI Tasks Processing: A Survey of Reconfigurable AI Chips

S Wei, X Lin, F Tu, Y Wang, L Liu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Nowadays, artificial intelligence (AI) technologies, especially deep neural networks (DNNs),
play a vital role in solving many problems in both academia and industry. In order to …

ELSA: Exploiting Layer-wise N:M Sparsity for Vision Transformer Acceleration

NC Huang, CC Chang, WC Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
N:M sparsity is an emerging model compression method supported by more and more
accelerators to speed up sparse matrix multiplication in deep neural networks. Most existing …
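
For concreteness, here is a minimal NumPy sketch of the magnitude-based pruning step behind the N:M pattern (2:4 shown, the ratio NVIDIA's sparse tensor cores support): in every group of M consecutive weights, only the N largest-magnitude entries survive. The helper name is hypothetical, and ELSA's layer-wise choice of N:M ratios goes beyond this.

```python
import numpy as np

def prune_n_of_m(weights, n=2, m=4):
    """Keep the n largest-magnitude weights in every group of m.

    Enforces N:M structured sparsity along the last axis: the nonzero
    budget per group is fixed, so hardware can store just the surviving
    values plus small per-group indices and skip zeros deterministically.
    Minimal sketch, not ELSA's layer-wise N:M selection.
    """
    w = weights.reshape(-1, m)                        # one row per group
    drop = np.argsort(np.abs(w), axis=1)[:, : m - n]  # smallest magnitudes
    mask = np.ones_like(w, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=1)
    return (w * mask).reshape(weights.shape)

w = np.random.randn(8, 16)
ws = prune_n_of_m(w)
assert (ws.reshape(-1, 4) != 0).sum(axis=1).max() <= 2  # ≤2 nonzeros per 4
```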

VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs

G Jeong, S Damani, AR Bambhaniya… - … Symposium on High …, 2023 - ieeexplore.ieee.org
Deep Learning (DL) acceleration support in CPUs has recently gained a lot of traction, with
several companies (Arm, Intel, IBM) announcing products with specialized matrix engines …

RM-STC: Row-Merge Dataflow Inspired GPU Sparse Tensor Core for Energy-Efficient Sparse Acceleration

G Huang, Z Wang, PA Tsai, C Zhang, Y Ding… - Proceedings of the 56th …, 2023 - dl.acm.org
This paper proposes RM-STC, a novel GPU tensor core architecture designed for sparse
Deep Neural Networks (DNNs) with two key innovations: (1) native support for both training …

Automated HW/SW Co-design for Edge AI: State, Challenges and Steps Ahead

O Bringmann, W Ecker, I Feldner… - Proceedings of the …, 2021 - dl.acm.org
Gigantic rates of data production in the era of Big Data, the Internet of Things (IoT), and Smart
Cyber-Physical Systems (CPS) pose incessantly escalating demands for massive data …

PDP: Parameter-free Differentiable Pruning is All You Need

M Cho, S Adya, D Naik - Advances in Neural Information …, 2024 - proceedings.neurips.cc
DNN pruning is a popular way to reduce the size of a model, improve the inference latency,
and minimize the power consumption on DNN accelerators. However, existing approaches …
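
As a rough illustration of what "differentiable pruning" means, here is a generic soft-mask sketch in NumPy: each weight is scaled by a smooth 0-to-1 function of its magnitude, so inside an autodiff framework gradients could flow through the mask during training. The names, threshold rule, and temperature are toy assumptions, not PDP's parameter-free formulation.

```python
import numpy as np

def soft_prune(weights, sparsity=0.7, temp=0.01):
    """Differentiable stand-in for hard magnitude pruning.

    Each weight is scaled by a sigmoid of how far its magnitude sits
    above a threshold picked to hit the target sparsity. The mask is
    smooth, so an autodiff framework could push gradients through it
    during training; at deployment it would be rounded to a hard 0/1
    mask. Generic toy, not PDP's parameter-free formulation.
    """
    mags = np.abs(weights)
    thresh = np.quantile(mags, sparsity)   # magnitude at the cut point
    mask = 1.0 / (1.0 + np.exp(-(mags - thresh) / temp))
    return weights * mask

w = np.random.randn(10_000)
wp = soft_prune(w)
print((np.abs(wp) < 1e-3).mean())  # close to (not exactly) the 0.7 target
```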

ETTE: Efficient Tensor-Train-based Computing Engine for Deep Neural Networks

Y Gong, M Yin, L Huang, J Xiao, Y Sui, C Deng… - Proceedings of the 50th …, 2023 - dl.acm.org
Tensor-train (TT) decomposition enables an ultra-high compression ratio, making deep
neural network (DNN) accelerators based on this method very attractive. TIE, the state-of-the …
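
To make the compression-ratio claim concrete, here is a small NumPy sketch that rebuilds a full tensor from its TT cores and counts stored values; the shapes, ranks, and helper name are made-up toy choices, not ETTE's (or TIE's) computing scheme.

```python
import numpy as np

def tt_reconstruct(cores):
    """Rebuild a full tensor from tensor-train (TT) cores.

    Core k has shape (r_{k-1}, n_k, r_k) with r_0 = r_d = 1, and entry
    T[i1, ..., id] is the matrix product cores[0][:, i1, :] @ ... @
    cores[d-1][:, id, :]. Storage falls from prod(n_k) values to
    sum(r_{k-1} * n_k * r_k), which is where the compression comes from.
    """
    t = cores[0]                                   # shape (1, n_1, r_1)
    for core in cores[1:]:
        # contract the trailing rank axis with the next core's leading one
        t = np.tensordot(t, core, axes=(t.ndim - 1, 0))
    return t.squeeze(axis=(0, -1))                 # drop unit boundary ranks

# toy 4x4x4x4 tensor with TT-ranks (1, 2, 2, 2, 1)
shapes = [(1, 4, 2), (2, 4, 2), (2, 4, 2), (2, 4, 1)]
cores = [np.random.randn(*s) for s in shapes]
full = tt_reconstruct(cores)
stored = sum(c.size for c in cores)
print(full.shape, f"-> {full.size} values from {stored} stored parameters")
```

Here a 256-element tensor is represented by 48 stored values; for realistic layer shapes and low TT-ranks the gap can grow dramatically, which is what makes TT-based accelerators attractive.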