Advances in the neural network quantization: A comprehensive review

L Wei, Z Ma, C Yang, Q Yao - Applied Sciences, 2024 - mdpi.com
Artificial intelligence technologies based on deep convolutional neural networks and large
language models have made significant breakthroughs in many tasks, such as image …

Advances in neural architecture search

X Wang, W Zhu - National Science Review, 2024 - academic.oup.com
Automated machine learning (AutoML) has achieved remarkable success in automating the
non-trivial process of designing machine learning models. Among the focal areas of AutoML …

Retraining-free model quantization via one-shot weight-coupling learning

C Tang, Y Meng, J Jiang, S Xie, R Lu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Quantization is of significance for compressing the over-parameterized deep neural models
and deploying them on resource-limited devices. Fixed-precision quantization suffers from …
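
As context for the fixed-precision baseline the snippet mentions, here is a minimal sketch of standard affine min-max quantization of a single tensor. The function names and the min-max calibration are illustrative assumptions, not the paper's weight-coupling method.

```python
import numpy as np

def quantize_uniform(w: np.ndarray, bits: int = 8):
    """Affine min-max quantization to `bits` bits (illustrative sketch,
    not the paper's one-shot weight-coupling approach)."""
    qmax = 2**bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.clip(np.round((w - lo) / scale), 0, qmax).astype(np.int32)
    return q, scale, lo

def dequantize(q, scale, lo):
    # Reconstruct approximate real values from the integer codes.
    return q * scale + lo

w = np.random.randn(4, 4).astype(np.float32)
q, s, z = quantize_uniform(w, bits=4)
print(np.abs(dequantize(q, s, z) - w).max())  # worst-case quantization error
```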

Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precision

X Huang, Z Shen, P Dong, KT Cheng - Transactions on Machine …, 2024 - openreview.net
Despite the outstanding performance of transformers in both language and vision tasks, the
expanding computation and model size have increased the demand for efficient …

Hessian-based mixed-precision quantization with transition aware training for neural networks

Z Huang, X Han, Z Yu, Y Zhao, M Hou, S Hu - Neural Networks, 2025 - Elsevier
Model quantization is widely used to realize the promise of ubiquitous embedded
deep network inference. While mixed-precision quantization has shown promising …
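
Hessian-based criteria generally allocate more bits to layers where the loss surface has higher curvature. A common way to estimate per-layer curvature is Hutchinson's trace estimator built from Hessian-vector products; the sketch below is a generic PyTorch version of that estimator, an assumed proxy rather than the paper's transition-aware training scheme.

```python
import torch

def hessian_trace(loss, params, n_samples=8):
    """Hutchinson estimate of tr(H): E[v^T H v] with Rademacher v.
    Illustrative sensitivity proxy, not the paper's exact criterion."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    est = 0.0
    for _ in range(n_samples):
        vs = [torch.randint_like(p, 2) * 2 - 1 for p in params]
        gv = sum((g * v).sum() for g, v in zip(grads, vs))
        hvs = torch.autograd.grad(gv, params, retain_graph=True)
        est += sum((h * v).sum().item() for h, v in zip(hvs, vs))
    return est / n_samples

# Layers with a larger trace are more sensitive and would keep wider bit-widths.
layer = torch.nn.Linear(16, 16)
loss = layer(torch.randn(8, 16)).pow(2).mean()
print(hessian_trace(loss, list(layer.parameters())))
```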

TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models

H Sun, C Tang, Z Wang, Y Meng, X Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models have emerged as preeminent contenders in the realm of generative
models. Distinguished by their distinctive sequential generative processes, characterized by …

Bit-Weight Adjustment for Bridging Uniform and Non-Uniform Quantization to Build Efficient Image Classifiers

X Zhou, Y Duan, R Ding, Q Wang, Q Wang, J Qin, H Liu - Electronics, 2023 - mdpi.com
Network quantization, which strives to reduce the precision of model parameters and/or
features, is one of the most efficient ways to accelerate model inference and reduce memory …
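
The uniform/non-uniform distinction in the title can be made concrete with a toy comparison: a uniform quantizer spaces its levels evenly over the weight range, while a non-uniform one fits levels to the weight distribution. The 1-D k-means codebook below is an assumed stand-in for a non-uniform scheme, not the paper's bit-weight adjustment.

```python
import numpy as np

def uniform_levels(w, bits):
    # Evenly spaced levels over the weight range.
    return np.linspace(w.min(), w.max(), 2**bits)

def kmeans_levels(w, bits, iters=20):
    # Non-uniform levels fitted to the weight distribution (1-D k-means).
    levels = uniform_levels(w, bits)
    flat = w.ravel()
    for _ in range(iters):
        idx = np.abs(flat[:, None] - levels[None, :]).argmin(axis=1)
        for k in range(len(levels)):
            if np.any(idx == k):
                levels[k] = flat[idx == k].mean()
    return np.sort(levels)

def quantize_to(w, levels):
    # Snap each weight to its nearest level.
    return levels[np.abs(w[..., None] - levels).argmin(axis=-1)]

w = np.random.randn(1024)
for name, lv in [("uniform", uniform_levels(w, 3)), ("k-means", kmeans_levels(w, 3))]:
    print(name, "MSE:", np.mean((quantize_to(w, lv) - w) ** 2))
```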

Investigating the Impact of Quantization on Adversarial Robustness

Q Li, Y Meng, C Tang, J Jiang, Z Wang - arXiv preprint arXiv:2404.05639, 2024 - arxiv.org
Quantization is a promising technique for reducing the bit-width of deep models to improve
their runtime performance and storage efficiency, and thus becomes a fundamental step for …

Mixed-Precision Embeddings for Large-Scale Recommendation Models

S Li, Z Hu, X Tang, H Wang, S Xu, W Luo, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Embedding techniques have become essential components of large databases in the deep
learning era. By encoding discrete entities, such as words, items, or graph nodes, into …
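
A common premise behind mixed-precision embeddings is that access frequencies are heavily skewed, so frequently hit rows can justify more bits than rare ones. The hot/cold split and per-row quantizer below are assumptions for illustration only, not the paper's allocation method.

```python
import numpy as np

def assign_bits(freqs, hot_bits=8, cold_bits=2, hot_frac=0.1):
    # Give the most frequently accessed rows the higher bit-width.
    order = np.argsort(freqs)[::-1]
    bits = np.full(len(freqs), cold_bits)
    bits[order[: int(len(freqs) * hot_frac)]] = hot_bits
    return bits

def quantize_row(row, bits):
    # Per-row affine min-max quantization, returning the reconstruction.
    qmax = 2**bits - 1
    lo, hi = row.min(), row.max()
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.clip(np.round((row - lo) / scale), 0, qmax)
    return q * scale + lo

table = np.random.randn(1000, 16).astype(np.float32)
freqs = np.random.zipf(2.0, size=1000)  # skewed access counts
bits = assign_bits(freqs)
recon = np.stack([quantize_row(r, b) for r, b in zip(table, bits)])
print("mean abs error:", np.abs(recon - table).mean())
```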

Accelerating CNN Inference with an Adaptive Quantization Method Using Computational Complexity-Aware Regularization

K Nakata, D Miyashita, J Deguchi… - IEICE Transactions on …, 2024 - jstage.jst.go.jp
Quantization is commonly used to reduce the inference time of convolutional neural
networks (CNNs). To reduce the inference time without drastically reducing accuracy …