Scalable iterative pruning of large language and vision models using block coordinate descent

G Rosenberg, JK Brubaker, MJA Schuetz… - arXiv preprint arXiv …, 2024 - arxiv.org
Pruning neural networks, which involves removing a fraction of their weights, can often
maintain high accuracy while significantly reducing model complexity, at least up to a certain …