A simple and effective pruning approach for large language models
As their size increases, Large Language Models (LLMs) are natural candidates for network
pruning methods: approaches that drop a subset of network weights while striving to …
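To make the teaser concrete: one-shot pruning of this kind assigns each weight a saliency score and zeroes the lowest-scoring fraction per layer, with no retraining. Below is a minimal sketch using plain weight magnitude as the score; the scoring rule and the per-output-row granularity are illustrative assumptions, not the criterion from this paper's truncated abstract.

```python
import numpy as np

def prune_by_magnitude(W: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero the `sparsity` fraction of weights with smallest |w| in each row.

    One-shot, no retraining; plain magnitude is a stand-in saliency score.
    """
    W = W.copy()
    k = int(sparsity * W.shape[1])               # weights to drop per output row
    if k == 0:
        return W
    # Indices of the k smallest-magnitude weights in each row
    idx = np.argpartition(np.abs(W), k, axis=1)[:, :k]
    np.put_along_axis(W, idx, 0.0, axis=1)
    return W

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
W_sparse = prune_by_magnitude(W, sparsity=0.5)
print((W_sparse == 0).mean())                    # ~0.5 of entries are zero
```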
Everybody prune now: Structured pruning of LLMs with only forward passes
Given the generational gap in available hardware between lay practitioners and the most
endowed institutions, LLMs are becoming increasingly inaccessible as they grow in size …
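The phrase "only forward passes" points at gradient-free importance scoring: disable a structure, rerun the model on calibration data, and rank structures by the loss increase. The toy two-layer network below is an illustrative assumption, not this paper's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy one-hidden-layer network and a calibration batch (illustrative only)
W1, W2 = rng.normal(size=(16, 8)), rng.normal(size=(1, 16))
X, y = rng.normal(size=(32, 8)), rng.normal(size=(32, 1))

def forward_loss(mask: np.ndarray) -> float:
    """Forward pass with hidden units masked out; returns MSE loss."""
    h = np.maximum(X @ W1.T, 0.0) * mask         # ReLU hidden layer, masked
    return float(np.mean((h @ W2.T - y) ** 2))

base = forward_loss(np.ones(16))
# Importance of unit i = loss increase when only unit i is removed;
# no gradients are ever computed, only forward passes.
scores = []
for i in range(16):
    mask = np.ones(16); mask[i] = 0.0
    scores.append(forward_loss(mask) - base)
keep = np.argsort(scores)[-8:]                   # keep the 8 most important units
print(sorted(keep.tolist()))
```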
FALCON: FLOP-Aware Combinatorial Optimization for Neural Network Pruning
The increasing computational demands of modern neural networks present deployment
challenges on resource-constrained devices. Network pruning offers a solution to reduce …
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning
While excellent in transfer learning, Vision-Language models (VLMs) come with high
computational costs due to their large number of parameters. To address this issue …
Multi-objective evolutionary architectural pruning of deep convolutional neural networks with weights inheritance
Despite the ongoing success of artificial intelligence applications, the deployment of deep
learning models on end devices remains challenging due to the limited onboard …
The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
The rising footprint of machine learning has led to a focus on imposing model sparsity as a
means of reducing computational and memory costs. For deep neural networks (DNNs), the …
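For context, the classical Optimal Brain Surgeon that this line of work builds on uses second-order information as follows: with loss Hessian H, zeroing weight w_i costs roughly w_i^2 / (2 [H^-1]_ii), and the surviving weights get the compensating update -(w_i / [H^-1]_ii) H^-1 e_i. A minimal sketch for one linear neuron under squared loss; the damping term and toy data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 6))                  # calibration inputs
w = rng.normal(size=6)                        # weights of one linear neuron
H = X.T @ X + 1e-4 * np.eye(6)                # Hessian of the squared loss (damped)
H_inv = np.linalg.inv(H)

# OBS saliency: estimated loss increase from zeroing weight i
saliency = w ** 2 / (2.0 * np.diag(H_inv))
i = int(np.argmin(saliency))                  # cheapest weight to remove

# OBS update: zero w_i and adjust the remaining weights to compensate
w_new = w - (w[i] / H_inv[i, i]) * H_inv[:, i]
w_new[i] = 0.0                                # exactly zero after the update
print(i, np.round(w_new, 3))
```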
OATS: Outlier-aware pruning through sparse and low rank decomposition
The recent paradigm shift to large-scale foundation models has brought about a new era for
deep learning that, while it has found great success in practice, has also been plagued by …
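The title describes approximating a weight matrix W as L + S, with L low-rank and S sparse. A generic alternating scheme (truncated SVD for L, hard thresholding for S) is sketched below; the rank, sparsity level, and plain alternation are illustrative assumptions, not the paper's outlier-aware algorithm.

```python
import numpy as np

def sparse_plus_low_rank(W, rank=4, keep=0.1, iters=10):
    """Alternate: L <- best rank-r fit of (W - S); S <- largest entries of (W - L)."""
    S = np.zeros_like(W)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(W - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]          # low-rank component
        R = W - L
        thresh = np.quantile(np.abs(R), 1.0 - keep)       # keep top `keep` fraction
        S = np.where(np.abs(R) >= thresh, R, 0.0)         # sparse component
    return L, S

rng = np.random.default_rng(0)
W = rng.normal(size=(32, 32))
L, S = sparse_plus_low_rank(W)
print(np.linalg.norm(W - (L + S)) / np.linalg.norm(W))    # relative residual
```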
L0Learn: A scalable package for sparse learning using ℓ0 regularization
We present L0Learn: an open-source package for sparse linear regression and
classification using ℓ0 regularization. L0Learn implements scalable, approximate …
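The underlying problem is ℓ0-penalized regression, min over β of ||y - Xβ||² + λ||β||₀, or its cardinality-constrained form. L0Learn itself uses coordinate descent with local combinatorial search; the sketch below instead solves the constrained form with iterative hard thresholding, purely as an illustration of ℓ0-style sparsity, not the package's algorithm.

```python
import numpy as np

def iht(X, y, k, iters=200):
    """Iterative hard thresholding for min ||y - X b||^2 s.t. ||b||_0 <= k."""
    n, p = X.shape
    step = 1.0 / np.linalg.norm(X, 2) ** 2        # 1 / Lipschitz constant
    b = np.zeros(p)
    for _ in range(iters):
        b = b + step * X.T @ (y - X @ b)          # gradient step
        idx = np.argsort(np.abs(b))[:-k]          # all but the k largest entries
        b[idx] = 0.0                              # hard threshold to support size k
    return b

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
b_true = np.zeros(20); b_true[:3] = [3.0, -2.0, 1.5]
y = X @ b_true + 0.05 * rng.normal(size=100)
print(np.nonzero(iht(X, y, k=3))[0])              # recovers support {0, 1, 2}
```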
Less is KEN: a universal and simple non-parametric pruning algorithm for large language models
Neural network pruning has become increasingly crucial due to the complexity of neural
network models and their widespread use in various fields. Existing pruning algorithms often …
LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging
Recent works show that reducing the number of layers in a convolutional neural network can
enhance efficiency while maintaining the performance of the network. Existing depth …
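The "merging" half of the title rests on a simple identity: once the nonlinearity between two layers is pruned, the two adjacent linear maps collapse into a single one. A minimal fully connected sketch follows; the paper works with convolutions, so the dense layers here are an illustrative stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(16, 8)), rng.normal(size=16)
W2, b2 = rng.normal(size=(4, 16)), rng.normal(size=4)

# With the activation between them removed, the two layers are one linear map:
# W2 @ (W1 @ x + b1) + b2 == (W2 @ W1) @ x + (W2 @ b1 + b2)
W_merged = W2 @ W1
b_merged = W2 @ b1 + b2

x = rng.normal(size=8)
two_layer = W2 @ (W1 @ x + b1) + b2
one_layer = W_merged @ x + b_merged
print(np.allclose(two_layer, one_layer))          # True: depth reduced, output kept
```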