FACT: FFN-Attention Co-Optimized Transformer Architecture with Eager Correlation Prediction
Transformer models are becoming prevalent in various AI applications thanks to their outstanding
performance. However, the high cost of computation and memory footprint makes their …
SOFA: A Compute-Memory Optimized Sparsity Accelerator via Cross-Stage Coordinated Tiling
H Wang, J Fang, X Tang, Z Yue, J Li… - 2024 57th IEEE/ACM …, 2024 - ieeexplore.ieee.org
Benefiting from the self-attention mechanism, Transformer models have attained impressive
contextual comprehension capabilities for lengthy texts. The requirements of high …
SG-Float: Achieving Memory Access and Computing Power Reduction Using Self-Gating Float in CNNs
Convolutional neural networks (CNNs) are essential for advancing the field of artificial
intelligence. However, since these networks are highly demanding in terms of memory and …
SySMOL: A Hardware-software Co-design Framework for Ultra-Low and Fine-Grained Mixed-Precision Neural Networks
C Zhou, V Richard, P Savarese, Z Hassman… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in quantization and mixed-precision techniques offer significant
promise for improving the run-time and energy efficiency of neural networks. In this work, we …
BitWave: Exploiting Column-Based Bit-Level Sparsity for Deep Learning Acceleration
Bit-serial computation facilitates bit-wise sequential data processing, offering numerous
benefits, such as a reduced area footprint and dynamically adaptive computational …
Pianissimo: A Sub-mW Class DNN Accelerator With Progressively Adjustable Bit-Precision
With the widespread adoption of edge AI, the diversity of application requirements and
fluctuating computational demands present significant challenges. Conventional …
Progressive Variable Precision DNN With Bitwise Ternary Accumulation
Progressive variable precision networks are capable of adapting to changing computational
needs over time using a single weight set. However, previous works have two problems: 1) …