Pruning and quantization for deep neural network acceleration: A survey

T Liang, J Glossner, L Wang, S Shi, X Zhang - Neurocomputing, 2021 - Elsevier
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …

Improving neural network quantization without retraining using outlier channel splitting

R Zhao, Y Hu, J Dotzel, C De Sa… - … conference on machine …, 2019 - proceedings.mlr.press
Quantization can improve the execution latency and energy efficiency of neural networks on
both commodity GPUs and specialized accelerators. The majority of existing literature …

On-device learning systems for edge intelligence: A software and hardware synergy perspective

Q Zhou, Z Qu, S Guo, B Luo, J Guo… - IEEE Internet of …, 2021 - ieeexplore.ieee.org
Modern machine learning (ML) applications are often deployed in the cloud environment to
exploit the computational power of clusters. However, this in-cloud computing scheme …

Algorithm-hardware co-design of adaptive floating-point encodings for resilient deep learning inference

T Tambe, EY Yang, Z Wan, Y Deng… - 2020 57th ACM/IEEE …, 2020 - ieeexplore.ieee.org
Conventional hardware-friendly quantization methods, such as fixed-point or integer, tend to
perform poorly at very low precision as their shrunken dynamic ranges cannot adequately …

Training and inference of large language models using 8-bit floating point

SP Perez, Y Zhang, J Briggs, C Blake… - arXiv preprint arXiv …, 2023 - arxiv.org
FP8 formats are gaining popularity to boost the computational efficiency for training and
inference of large deep learning models. Their main challenge is that a careful choice of …

Low-precision floating-point arithmetic for high-performance fpga-based cnn acceleration

C Wu, M Wang, X Chu, K Wang, L He - ACM Transactions on …, 2021 - dl.acm.org
Low-precision data representation is important to reduce storage size and memory access
for convolutional neural networks (CNNs). Yet, existing methods have two major …

Fighting quantization bias with bias

A Finkelstein, U Almog, M Grobman - arXiv preprint arXiv:1906.03193, 2019 - arxiv.org
Low-precision representation of deep neural networks (DNNs) is critical for efficient
deployment of deep learning applications on embedded platforms; however, converting the …

Optimizing FPGA-Based DNN accelerator with shared exponential floating-point format

W Zhao, Q Dang, T Xia, J Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
In recent years, low-precision fixed-point computation has become a widely used technique
for neural network inference on FPGAs. However, this approach has some limitations, as …

3D-ReG: A 3D ReRAM-based heterogeneous architecture for training deep neural networks

B Li, JR Doppa, PP Pande, K Chakrabarty… - ACM Journal on …, 2020 - dl.acm.org
Deep neural network (DNN) models are being expanded to a broader range of applications.
The computational capability of traditional hardware platforms cannot accommodate the …

VRU Pose-SSD: Multiperson pose estimation for automated driving

C Kumar, J Ramesh, B Chakraborty, R Raman… - Proceedings of the …, 2021 - ojs.aaai.org
We present a fast and efficient approach for joint person detection and pose estimation
optimized for automated driving (AD) in urban scenarios. We use a multitask weight sharing …