The emergence of essential sparsity in large pre-trained models: The weights that matter
Large pre-trained transformers are show-stealers in modern-day deep learning,
and it becomes crucial to comprehend the parsimonious patterns that exist within them as …
Instant soup: Cheap pruning ensembles in a single pass can draw lottery tickets from large models
Large pre-trained transformers have been receiving explosive attention in the past few
years, due to their adaptability to numerous downstream applications via fine-tuning, but …
Dynamic sparse no training: Training-free fine-tuning for sparse LLMs
The ever-increasing large language models (LLMs), though opening a potential path toward the
upcoming artificial general intelligence, sadly drop a daunting obstacle on the way towards …
Ten lessons we have learned in the new "sparseland": A short handbook for sparse neural network researchers
This article does not propose any novel algorithm or new hardware for sparsity. Instead, it
aims to serve the "common good" for the increasingly prosperous Sparse Neural Network …
Learning scalable model soup on a single gpu: An efficient subspace training strategy
Pre-training followed by fine-tuning is widely adopted among practitioners. The performance
can be improved by “model soups”, which explore various hyperparameter configurations …
Domain-generalizable multiple-domain clustering
Accurately clustering high-dimensional measurements is vital for adequately analyzing
scientific data. Deep learning machinery has remarkably improved clustering capabilities in …
Federated and edge learning for large language models
As the demand for sophisticated language models (LMs) continues to grow, the necessity to
deploy them efficiently across federated and edge environments becomes increasingly …
Sequential Bayesian neural subnetwork ensembles
Deep ensembles have emerged as a powerful technique for improving predictive
performance and enhancing model robustness across various applications by leveraging …
SEVEN: Pruning Transformer Model by Reserving Sentinels
Large-scale Transformer models (TM) have demonstrated outstanding performance across
various tasks. However, their considerable parameter size restricts their applicability …
Unveiling the Intertwined Relationship Between Essential Sparsity and Robustness in Large Pre-trained Models
In the era of pre-trained LLMs, understanding their intrinsic sparse patterns becomes
paramount, especially in the context of their scalability and efficiency. Recently, Jaiswal et …