Bringing AI to edge: From deep learning's perspective

D Liu, H Kong, X Luo, W Liu, R Subramaniam - Neurocomputing, 2022 - Elsevier
Edge computing and artificial intelligence (AI), especially deep learning algorithms, are
gradually intersecting to build the novel system, namely edge intelligence. However, the …

A white paper on neural network quantization

M Nagel, M Fournarakis, RA Amjad… - arXiv preprint arXiv …, 2021 - arxiv.org
While neural networks have advanced the frontiers in many applications, they often come at
a high computational cost. Reducing the power and latency of neural network inference is …
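
For orientation, below is a minimal NumPy sketch of the uniform affine (asymmetric) quantization that white papers of this kind cover; the weight tensor shape and the 8-bit setting are illustrative choices, not taken from the paper.

    import numpy as np

    def fake_quantize_affine(x, num_bits=8):
        # Uniform affine quantization: map floats onto an integer grid [0, 2^b - 1]
        # with a scale and zero-point, then de-quantize to expose the rounding error.
        qmin, qmax = 0, 2 ** num_bits - 1
        scale = (x.max() - x.min()) / (qmax - qmin)
        zero_point = np.round(-x.min() / scale)
        q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
        return (q - zero_point) * scale

    w = np.random.randn(64, 64).astype(np.float32)   # illustrative weight tensor
    w_q = fake_quantize_affine(w, num_bits=8)
    print("max absolute rounding error:", np.abs(w - w_q).max())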

Pruning vs quantization: Which is better?

A Kuzmin, M Nagel, M Van Baalen… - Advances in neural …, 2023 - proceedings.neurips.cc
Neural network pruning and quantization techniques are almost as old as neural networks
themselves. However, to date, only ad-hoc comparisons between the two have been …
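
As a rough illustration of the kind of question the paper formalizes, the sketch below measures the reconstruction error of magnitude pruning versus uniform quantization on a random weight tensor; the 25% keep ratio and 4-bit grid are arbitrary stand-ins for a matched compression budget, not the paper's protocol.

    import numpy as np

    def prune_magnitude(w, keep_ratio):
        # Zero all but the largest-magnitude fraction `keep_ratio` of weights.
        k = int(keep_ratio * w.size)
        thresh = np.sort(np.abs(w).ravel())[-k]
        return np.where(np.abs(w) >= thresh, w, 0.0)

    def quantize_symmetric(w, num_bits):
        # Symmetric uniform quantization with round-to-nearest.
        qmax = 2 ** (num_bits - 1) - 1
        scale = np.abs(w).max() / qmax
        return np.clip(np.round(w / scale), -qmax, qmax) * scale

    w = np.random.randn(512, 512).astype(np.float32)   # illustrative weight tensor
    mse_prune = np.mean((w - prune_magnitude(w, keep_ratio=0.25)) ** 2)
    mse_quant = np.mean((w - quantize_symmetric(w, num_bits=4)) ** 2)
    print(f"pruning MSE: {mse_prune:.5f}   quantization MSE: {mse_quant:.5f}")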

Up or down? Adaptive rounding for post-training quantization

M Nagel, RA Amjad, M Van Baalen… - International …, 2020 - proceedings.mlr.press
When quantizing neural networks, assigning each floating-point weight to its nearest fixed-
point value is the predominant approach. We find that, perhaps surprisingly, this is not the …
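
A toy brute-force illustration of the point the snippet alludes to: rounding each weight up or down defines a small search space in which round-to-nearest is only one candidate, and it need not minimize the error of the layer's output. The tiny layer, calibration data and 4-bit grid below are made up, and AdaRound itself optimizes a continuous relaxation rather than searching exhaustively.

    import numpy as np
    from itertools import product

    rng = np.random.default_rng(0)
    w = rng.normal(size=4)            # illustrative float weights of a 4-input neuron
    x = rng.normal(size=(256, 4))     # illustrative calibration inputs
    scale = np.abs(w).max() / 7       # 4-bit symmetric grid, for illustration

    def output_mse(w_int):
        # Error of the layer's *output*, not of the weights themselves.
        return np.mean((x @ w - x @ (np.asarray(w_int) * scale)) ** 2)

    nearest = np.round(w / scale)
    candidates = product(*[(np.floor(v), np.ceil(v)) for v in w / scale])
    best = min(candidates, key=output_mse)
    print("round-to-nearest output MSE:", output_mse(nearest))
    print("best up/down rounding MSE:  ", output_mse(best))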

Overcoming oscillations in quantization-aware training

M Nagel, M Fournarakis… - International …, 2022 - proceedings.mlr.press
When training neural networks with simulated quantization, we observe that quantized
weights can, rather unexpectedly, oscillate between two grid-points. The importance of this …
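
A one-dimensional sketch of the phenomenon the snippet describes: with simulated (fake) quantization and a straight-through gradient, a latent weight whose optimum falls between two grid points keeps flipping between them. The step size, target and learning rate are made up for illustration.

    import numpy as np

    scale = 0.1          # quantization step size (illustrative)
    target = 0.25        # optimum of a toy 1-D loss, deliberately between two grid points
    w, lr = 0.23, 0.2    # latent float weight and learning rate

    for step in range(8):
        q = scale * np.round(w / scale)    # forward pass: fake-quantized weight
        grad = 2.0 * (q - target)          # gradient of (q - target)^2; STE treats dq/dw as 1
        w -= lr * grad
        print(f"step {step}: quantized weight = {q:+.2f}, latent weight -> {w:+.3f}")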

Understanding and overcoming the challenges of efficient transformer quantization

Y Bondarenko, M Nagel, T Blankevoort - arXiv preprint arXiv:2109.12948, 2021 - arxiv.org
Transformer-based architectures have become the de-facto standard models for a wide
range of Natural Language Processing tasks. However, their memory footprint and high …

Ultra-low precision 4-bit training of deep neural networks

X Sun, N Wang, CY Chen, J Ni… - Advances in …, 2020 - proceedings.neurips.cc
In this paper, we propose a number of novel techniques and numerical representation
formats that enable, for the very first time, the precision of training systems to be aggressively …

A review of state-of-the-art mixed-precision neural network frameworks

M Rakka, ME Fouda, P Khargonekar… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Mixed-precision Deep Neural Networks (DNNs) provide an efficient solution for hardware
deployment, especially under resource constraints, while maintaining model accuracy …
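
A toy sketch of the per-layer bit-width assignment problem such frameworks automate: meet a model-size budget while keeping the layers most sensitive to quantization at higher precision. Layer names, parameter counts, sensitivities and the greedy policy are all illustrative, not drawn from the survey.

    # name: (parameter count, sensitivity to quantization) -- made-up numbers
    layers = {
        "conv1":  (9_408,     0.90),
        "block3": (2_097_152, 0.20),
        "fc":     (512_000,   0.65),
    }
    total_params = sum(n for n, _ in layers.values())
    budget_bits = int(0.6 * 8 * total_params)     # allow 60% of the all-8-bit size

    def model_bits(assignment):
        return sum(bits * layers[name][0] for name, bits in assignment.items())

    # Greedy policy: start at 8 bits everywhere, demote the least sensitive
    # layers to 4 bits until the budget is met.
    assignment = {name: 8 for name in layers}
    for name, _ in sorted(layers.items(), key=lambda kv: kv[1][1]):
        if model_bits(assignment) <= budget_bits:
            break
        assignment[name] = 4
    print(assignment, model_bits(assignment), "<=", budget_bits)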

FP8 quantization: The power of the exponent

A Kuzmin, M Van Baalen, Y Ren… - Advances in …, 2022 - proceedings.neurips.cc
When quantizing neural networks for efficient inference, low-bit integers are the go-to format
for efficiency. However, low-bit floating point numbers have an extra degree of freedom …
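
To make that extra degree of freedom concrete, the sketch below enumerates the non-negative values of a toy 8-bit floating-point format for two exponent/mantissa splits; it ignores inf/NaN encodings, and the biases are illustrative rather than those of any particular standard.

    import numpy as np

    def fp_grid(exp_bits, man_bits, bias):
        # Non-negative values of a sign + exponent + mantissa format,
        # including subnormals, excluding inf/NaN special encodings.
        values = [0.0]
        for e in range(2 ** exp_bits):
            for m in range(2 ** man_bits):
                if e == 0:   # subnormal range
                    values.append(m / 2 ** man_bits * 2.0 ** (1 - bias))
                else:        # normal range
                    values.append((1 + m / 2 ** man_bits) * 2.0 ** (e - bias))
        return np.unique(values)

    # Trading mantissa bits (precision near zero) for exponent bits (dynamic range).
    for exp_bits, man_bits, bias in [(4, 3, 7), (5, 2, 15)]:
        g = fp_grid(exp_bits, man_bits, bias)
        print(f"E{exp_bits}M{man_bits}: max = {g.max():g}, "
              f"smallest positive = {g[g > 0].min():g}, levels = {len(g)}")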

Hybrid 8-bit floating point (HFP8) training and inference for deep neural networks

X Sun, J Choi, CY Chen, N Wang… - Advances in neural …, 2019 - proceedings.neurips.cc
Reducing the numerical precision of data and computation is extremely effective in
accelerating deep learning training workloads. Towards this end, 8-bit floating point …