Review of lightweight deep convolutional neural networks

F Chen, S Li, J Han, F Ren, Z Yang - Archives of Computational Methods …, 2024 - Springer
Lightweight deep convolutional neural networks (LDCNNs) are vital components of mobile
intelligence, particularly in mobile vision. Although various heavy networks with increasingly …

Applications and techniques for fast machine learning in science

AMC Deiana, N Tran, J Agar, M Blott… - Frontiers in big …, 2022 - frontiersin.org
In this community review report, we discuss applications and techniques for fast machine
learning (ML) in science—the concept of integrating powerful ML methods into the real-time …

QuIP: 2-bit quantization of large language models with guarantees

J Chee, Y Cai, V Kuleshov… - Advances in Neural …, 2024 - proceedings.neurips.cc
This work studies post-training parameter quantization in large language models (LLMs).
We introduce quantization with incoherence processing (QuIP), a new method based on the …
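
The method's name points at its core trick: before rounding, the weight matrix is multiplied on both sides by random orthogonal matrices, which makes the weights "incoherent" (no dominant outlier directions), and the rotation is inverted at inference. A minimal numpy sketch of that rotation idea, with plain nearest rounding standing in for QuIP's adaptive (LDLQ) rounding and structured rotations:

```python
import numpy as np

def random_orthogonal(n, rng):
    """Haar-random orthogonal matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))  # sign fix for uniformity

def quantize_with_rotation(W, num_bits=2, seed=0):
    rng = np.random.default_rng(seed)
    m, n = W.shape
    U, V = random_orthogonal(m, rng), random_orthogonal(n, rng)
    Wr = U @ W @ V.T                       # rotated, "incoherent" weights
    qmax = 2 ** (num_bits - 1) - 1         # symmetric signed grid
    scale = np.abs(Wr).max() / qmax
    Q = np.clip(np.round(Wr / scale), -qmax - 1, qmax)
    return U.T @ (Q * scale) @ V           # undo the rotation at inference

W = np.random.default_rng(1).standard_normal((64, 64))
W[0, 0] = 25.0                             # an outlier the rotation spreads out
err = np.linalg.norm(W - quantize_with_rotation(W)) / np.linalg.norm(W)
print(f"relative error at 2 bits: {err:.3f}")
```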

A survey of quantization methods for efficient neural network inference

A Gholami, S Kim, Z Dong, Z Yao… - Low-Power Computer …, 2022 - taylorfrancis.com
This chapter provides approaches to the problem of quantizing the numerical values in deep
neural network computations, covering the advantages/disadvantages of current methods …
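
As background for the surveyed methods, almost all of them build on the same uniform affine primitive: pick a scale and zero-point from the tensor's range, round to integers, and clip. A generic sketch of that primitive (not code from the chapter):

```python
import numpy as np

def quantize_affine(x, num_bits=8):
    """Uniform affine (asymmetric) quantization to unsigned integers."""
    qmax = 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / qmax or 1.0      # guard against constant input
    zero_point = int(round(-x_min / scale))    # integer that represents 0.0
    # uint8 holds any grid up to 8 bits
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.default_rng(0).standard_normal((4, 4)).astype(np.float32)
q, s, z = quantize_affine(x)
print("max abs error:", np.abs(x - dequantize_affine(q, s, z)).max())
```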

Q-Diffusion: Quantizing diffusion models

X Li, Y Liu, L Lian, H Yang, Z Dong… - Proceedings of the …, 2023 - openaccess.thecvf.com
Diffusion models have achieved great success in image synthesis through iterative noise
estimation using deep neural networks. However, the slow inference, high memory …
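
The snippet cuts off before the method; the paper's central observation is that activation distributions of the noise-prediction network drift across denoising timesteps, so post-training calibration data is sampled from timesteps spread over the whole trajectory rather than from a single step. A toy numpy sketch of that pooled-timestep calibration (illustrative data, not the paper's code):

```python
import numpy as np

def pooled_timestep_calibration(act_by_step, stride=5, num_bits=8):
    """Fit one clipping scale on activations pooled from timesteps
    strided across the whole denoising trajectory."""
    pooled = np.concatenate([act_by_step[t].ravel()
                             for t in range(0, len(act_by_step), stride)])
    qmax = 2 ** (num_bits - 1) - 1
    return np.abs(pooled).max() / qmax

# toy trajectory: activation magnitudes shrink as denoising progresses
rng = np.random.default_rng(0)
acts = [rng.standard_normal(1024) * (1.0 - t / 60) for t in range(50)]
print("pooled scale:        ", pooled_timestep_calibration(acts))
print("last-step-only scale:", np.abs(acts[-1]).max() / 127)  # underestimates
```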

DeepSpeed-MoE: Advancing mixture-of-experts inference and training to power next-generation AI scale

S Rajbhandari, C Li, Z Yao, M Zhang… - International …, 2022 - proceedings.mlr.press
As the training of giant dense models hits the limits of today's hardware availability and
capability, Mixture-of-Experts (MoE) models have become one of the …
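
The MoE layers this system serves route each token through only the k experts its learned gate scores highest, which is what keeps compute sublinear in parameter count. A minimal numpy sketch of generic top-k gating (not DeepSpeed's implementation):

```python
import numpy as np

def topk_gate(x, W_gate, k=2):
    """Return, per token, the indices of its top-k experts and their
    softmax weights renormalized over just the chosen experts."""
    logits = x @ W_gate                                  # (tokens, experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)
    weights = np.take_along_axis(probs, topk, axis=-1)
    weights /= weights.sum(-1, keepdims=True)
    return topk, weights

rng = np.random.default_rng(0)
experts, weights = topk_gate(rng.standard_normal((4, 16)),   # 4 tokens
                             rng.standard_normal((16, 8)))   # 8 experts
print(experts)   # which 2 of the 8 experts each token is sent to
print(weights)   # mixing weights for combining their outputs
```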

Post-training quantization for vision transformer

Z Liu, Y Wang, K Han, W Zhang… - Advances in Neural …, 2021 - proceedings.neurips.cc
Recently, transformers have achieved remarkable performance on a variety of computer vision
applications. Compared with mainstream convolutional neural networks, vision transformers …
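
Post-training quantization fixes clipping ranges from a few calibration batches instead of retraining. The paper itself searches per-layer scales with a ranking-aware loss; the sketch below shows the simpler percentile-clipping recipe such methods start from, which tolerates the long-tailed activations transformers produce better than a raw max does:

```python
import numpy as np

def calibrate_scale(batches, num_bits=8, percentile=99.99):
    """Per-tensor scale from calibration batches, clipping at a high
    percentile of |activation| rather than the absolute maximum."""
    flat = np.abs(np.concatenate([b.ravel() for b in batches]))
    clip = np.percentile(flat, percentile)
    return clip / (2 ** (num_bits - 1) - 1)

rng = np.random.default_rng(0)
batches = [rng.standard_normal((8, 64, 192)) for _ in range(4)]
print("percentile scale:", calibrate_scale(batches))
print("max-based scale: ",
      np.abs(np.concatenate([b.ravel() for b in batches])).max() / 127)
```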

SqueezeLLM: Dense-and-sparse quantization

S Kim, C Hooper, A Gholami, Z Dong, X Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Generative Large Language Models (LLMs) have demonstrated remarkable results for a
wide range of tasks. However, deploying these models for inference has been a significant …
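
The title's dense-and-sparse decomposition splits each weight matrix into a tiny sparse matrix holding the full-precision outliers plus a dense remainder quantized to a low-bit grid. A numpy sketch of the split, with uniform quantization standing in for SqueezeLLM's sensitivity-weighted k-means codebook:

```python
import numpy as np

def dense_and_sparse(W, outlier_frac=0.005, num_bits=3):
    """W ≈ Q * scale + S, where S keeps the largest-magnitude weights
    in full precision and Q is the low-bit dense remainder."""
    thresh = np.quantile(np.abs(W), 1.0 - outlier_frac)
    mask = np.abs(W) >= thresh
    S = np.where(mask, W, 0.0)              # sparse outliers, full precision
    D = np.where(mask, 0.0, W)              # dense part to quantize
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(D).max() / qmax
    Q = np.clip(np.round(D / scale), -qmax - 1, qmax)
    return Q, scale, S

W = np.random.default_rng(0).standard_normal((128, 128))
W[3, 7] = 40.0                              # inject one outlier
Q, s, S = dense_and_sparse(W)
err = np.linalg.norm(W - (Q * s + S)) / np.linalg.norm(W)
print(f"relative error at 3 bits: {err:.3f}")
```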

HAWQ-V3: Dyadic neural network quantization

Z Yao, Z Dong, Z Zheng, A Gholami… - International …, 2021 - proceedings.mlr.press
Current low-precision quantization algorithms often have the hidden cost of conversion back
and forth from floating point to quantized integer values. This hidden cost limits the latency …
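
The "dyadic" in the title is how that conversion cost is removed: every rescaling factor is constrained to b / 2^c with integers b and c, so requantizing an int32 accumulator is one integer multiply and one bit shift, never touching floating point. A minimal sketch (the scale value is illustrative; assumes c >= 1):

```python
import numpy as np

def to_dyadic(scale, max_shift=31):
    """Approximate a positive float scale by b / 2**c, b and c integers,
    keeping the multiplier b within int32 range."""
    c = max_shift
    b = round(scale * (1 << c))
    while b >= (1 << 31) and c > 0:
        c -= 1
        b = round(scale * (1 << c))
    return b, c

def requantize(acc, b, c):
    """Integer-only rescale with rounding: (acc * b + 2**(c-1)) >> c."""
    return (acc * b + (1 << (c - 1))) >> c

scale = 0.0173                               # e.g. s_in * s_w / s_out
b, c = to_dyadic(scale)
acc = np.array([1234, -5678, 40000])         # int32 accumulators
print(requantize(acc, b, c))                 # integer-only path
print(np.round(acc * scale).astype(int))     # float reference, same values
```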

ZeroQ: A novel zero shot quantization framework

Y Cai, Z Yao, Z Dong, A Gholami… - Proceedings of the …, 2020 - openaccess.thecvf.com
Quantization is a promising approach for reducing the inference time and memory footprint
of neural networks. However, most existing quantization methods require access to the …
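
ZeroQ's answer to the missing training data is to synthesize it: random inputs are optimized until their per-layer batch statistics match the BatchNorm running statistics already stored inside the pretrained model, and the result serves as calibration data. A PyTorch-style sketch of that distillation loop (a simplification of the paper's procedure; assumes BN layers execute in `modules()` order, as they do in plain CNNs):

```python
import torch
import torch.nn as nn

def distill_data(model, shape=(32, 3, 224, 224), steps=200, lr=0.1):
    """Optimize random inputs so per-layer batch statistics match the
    BatchNorm running statistics stored in the pretrained model."""
    model.eval()  # freeze BN running stats; only the input is optimized
    bn_layers = [m for m in model.modules() if isinstance(m, nn.BatchNorm2d)]
    captured = []

    def hook(module, inputs, output):
        x = inputs[0]
        captured.append((x.mean(dim=(0, 2, 3)), x.var(dim=(0, 2, 3))))

    handles = [m.register_forward_hook(hook) for m in bn_layers]
    x = torch.randn(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        captured.clear()
        opt.zero_grad()
        model(x)
        # pair captured batch stats with each BN layer's stored stats
        loss = sum((mu - m.running_mean).pow(2).sum()
                   + (var - m.running_var).pow(2).sum()
                   for (mu, var), m in zip(captured, bn_layers))
        loss.backward()
        opt.step()
    for h in handles:
        h.remove()
    return x.detach()

# e.g. calib = distill_data(torchvision.models.resnet18(weights="DEFAULT"))
```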