A comprehensive overview of large language models
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …
natural language processing tasks and beyond. This success of LLMs has led to a large …
Model compression and hardware acceleration for neural networks: A comprehensive survey
Domain-specific hardware is becoming a promising topic in the backdrop of improvement
slow down for general-purpose processors due to the foreseeable end of Moore's Law …
slow down for general-purpose processors due to the foreseeable end of Moore's Law …
A survey of quantization methods for efficient neural network inference
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …
Neural Network computations, covering the advantages/disadvantages of current methods …
Ptq4vit: Post-training quantization for vision transformers with twin uniform quantization
Quantization is one of the most effective methods to compress neural networks, which has
achieved great success on convolutional neural networks (CNNs). Recently, vision …
achieved great success on convolutional neural networks (CNNs). Recently, vision …
Lut-gemm: Quantized matrix multiplication based on luts for efficient inference in large-scale generative language models
The recent advancements in self-supervised learning, combined with the Transformer
architecture, have enabled natural language processing (NLP) to achieve remarkably low …
architecture, have enabled natural language processing (NLP) to achieve remarkably low …
Clip-q: Deep network compression learning by in-parallel pruning-quantization
Deep neural networks enable state-of-the-art accuracy on visual recognition tasks such as
image classification and object detection. However, modern deep networks contain millions …
image classification and object detection. However, modern deep networks contain millions …
Review of lightweight deep convolutional neural networks
F Chen, S Li, J Han, F Ren, Z Yang - Archives of Computational Methods …, 2024 - Springer
Lightweight deep convolutional neural networks (LDCNNs) are vital components of mobile
intelligence, particularly in mobile vision. Although various heavy networks with increasingly …
intelligence, particularly in mobile vision. Although various heavy networks with increasingly …
A survey on methods and theories of quantized neural networks
Y Guo - arxiv preprint arxiv:1808.04752, 2018 - arxiv.org
Deep neural networks are the state-of-the-art methods for many real-world tasks, such as
computer vision, natural language processing and speech recognition. For all its popularity …
computer vision, natural language processing and speech recognition. For all its popularity …
Compression of deep learning models for text: A survey
In recent years, the fields of natural language processing (NLP) and information retrieval (IR)
have made tremendous progress thanks to deep learning models like Recurrent Neural …
have made tremendous progress thanks to deep learning models like Recurrent Neural …
Structured binary neural networks for accurate image classification and semantic segmentation
In this paper, we propose to train convolutional neural networks (CNNs) with both binarized
weights and activations, leading to quantized models specifically for mobile devices with …
weights and activations, leading to quantized models specifically for mobile devices with …