SparseGPT: Massive language models can be accurately pruned in one-shot

E Frantar, D Alistarh - International Conference on Machine Learning, 2023 - proceedings.mlr.press
We show for the first time that large-scale generative pretrained transformer (GPT) family
models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal …
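SparseGPT itself relies on approximate second-order (Hessian-based) weight reconstruction; as a hedged illustration of the one-shot, no-retraining setting the snippet describes, here is the simpler magnitude-pruning baseline in PyTorch. The helper name and the 50% target are illustrative, not the paper's method:

```python
import torch

def magnitude_prune_(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """One-shot magnitude pruning: zero the smallest-magnitude entries
    until `sparsity` of the tensor is zero, with no retraining afterwards."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight.mul_(weight.abs() > threshold)  # keep only the large weights

layer = torch.nn.Linear(512, 512)
with torch.no_grad():
    magnitude_prune_(layer.weight, sparsity=0.5)
print(f"sparsity: {(layer.weight == 0).float().mean():.2%}")
```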

Patch diffusion: Faster and more data-efficient training of diffusion models

Z Wang, Y Jiang, H Zheng, P Wang… - Advances in Neural Information Processing Systems, 2023 - proceedings.neurips.cc
Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …
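A rough sketch of what patch-wise training could look like: crop a random patch per image and append coordinate channels so the denoiser knows where the patch sits in the full image. The `sample_patches` helper and the fixed patch size are illustrative assumptions; the paper's actual recipe also mixes multiple patch sizes:

```python
import torch

def sample_patches(images: torch.Tensor, patch: int = 64) -> torch.Tensor:
    """Crop one random patch per image and append normalized (x, y)
    coordinate channels marking the patch's location in the full image."""
    b, c, h, w = images.shape
    ys = torch.randint(0, h - patch + 1, (b,))
    xs = torch.randint(0, w - patch + 1, (b,))
    crops, coords = [], []
    for img, y, x in zip(images, ys, xs):
        crops.append(img[:, y:y + patch, x:x + patch])
        # Coordinate grid in [-1, 1] relative to the full image.
        gy = torch.linspace(-1, 1, h)[y:y + patch]
        gx = torch.linspace(-1, 1, w)[x:x + patch]
        yy, xx = torch.meshgrid(gy, gx, indexing="ij")
        coords.append(torch.stack([xx, yy]))
    return torch.cat([torch.stack(crops), torch.stack(coords)], dim=1)

batch = torch.randn(8, 3, 256, 256)
patches = sample_patches(batch)   # feed to the diffusion loss as usual
print(patches.shape)              # torch.Size([8, 5, 64, 64])
```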

Efficient spatially sparse inference for conditional GANs and diffusion models

M Li, J Lin, C Meng, S Ermon… - Advances in Neural Information Processing Systems, 2022 - proceedings.neurips.cc
During image editing, existing deep generative models tend to re-synthesize the entire
output from scratch, including the unedited regions. This leads to a significant waste of …
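A minimal sketch of the core idea of reusing cached activations outside the edited region. The `sparse_update` helper, tile size, and tolerance are assumptions; a 1×1 convolution is used so tiles stay independent, whereas real layers would need tiles padded by the receptive field:

```python
import torch

def sparse_update(layer, cached_out, old_x, new_x, tile=16, tol=1e-3):
    """Re-run `layer` only on tiles where the input changed, reusing the
    cached output everywhere else."""
    diff = (new_x - old_x).abs()
    _, _, h, w = diff.shape
    out = cached_out.clone()
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            if diff[:, :, y:y + tile, x:x + tile].max() > tol:
                out[:, :, y:y + tile, x:x + tile] = layer(
                    new_x[:, :, y:y + tile, x:x + tile])
    return out

conv = torch.nn.Conv2d(3, 8, kernel_size=1)  # 1x1: tiles are independent
old = torch.randn(1, 3, 128, 128)
new = old.clone()
new[..., 40:60, 40:60] += 1.0                # a small local edit
with torch.no_grad():
    cached = conv(old)                        # computed once, then reused
    out = sparse_update(conv, cached, old, new)
```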

EfficientViT: Lightweight multi-scale attention for high-resolution dense prediction

H Cai, J Li, M Hu, C Gan, S Han - Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023 - openaccess.thecvf.com
High-resolution dense prediction enables many appealing real-world applications, such as
computational photography, autonomous driving, etc. However, the vast computational cost …
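The lightweight attention here builds on ReLU-based linear attention, whose cost grows linearly rather than quadratically with token count. A self-contained sketch of that primitive, without the paper's multi-scale aggregation (function name is illustrative):

```python
import torch

def relu_linear_attention(q, k, v, eps=1e-6):
    """Linear attention with a ReLU feature map: O(n * d^2) cost instead
    of softmax attention's O(n^2 * d), which matters at high resolution."""
    q, k = torch.relu(q), torch.relu(k)
    kv = torch.einsum("bnd,bne->bde", k, v)        # d x e key/value summary
    z = 1.0 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + eps)
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)

q = k = v = torch.randn(2, 1024, 32)   # batch, tokens, head dim
out = relu_linear_attention(q, k, v)
print(out.shape)                        # torch.Size([2, 1024, 32])
```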

Distributed artificial intelligence empowered by end-edge-cloud computing: A survey

S Duan, D Wang, J Ren, F Lyu, Y Zhang… - IEEE Communications Surveys & Tutorials, 2022 - ieeexplore.ieee.org
As the computing paradigm shifts from cloud computing to end-edge-cloud computing, artificial
intelligence is likewise evolving from a centralized to a distributed paradigm …

Optimal brain compression: A framework for accurate post-training quantization and pruning

E Frantar, D Alistarh - Advances in Neural Information Processing Systems, 2022 - proceedings.neurips.cc
We consider the problem of model compression for deep neural networks (DNNs) in the
challenging one-shot/post-training setting, in which we are given an accurate trained model …
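OBC works layer by layer, compressing weights while keeping the layer's output on calibration data close to the original. The sketch below shows that layer-wise objective with plain round-to-nearest quantization as a stand-in for OBC's second-order weight updates; all names are illustrative:

```python
import torch

def quantize_rtn(weight: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Round-to-nearest uniform quantization of one weight matrix. OBC goes
    further, updating remaining weights after each decision; this is just
    the simplest post-training baseline."""
    qmax = 2 ** (bits - 1) - 1
    scale = weight.abs().max() / qmax
    return (weight / scale).round().clamp(-qmax - 1, qmax) * scale

# Layer-wise objective: keep the layer's output on calibration data close.
W = torch.randn(256, 512)
X = torch.randn(512, 128)                  # calibration activations
W_hat = quantize_rtn(W, bits=4)
err = torch.norm(W @ X - W_hat @ X) / torch.norm(W @ X)
print(f"relative output error: {err:.4f}")
```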

On-device training under 256KB memory

J Lin, L Zhu, WM Chen, WC Wang… - Advances in Neural Information Processing Systems, 2022 - proceedings.neurips.cc
On-device training enables the model to adapt to new data collected from the sensors by
fine-tuning a pre-trained model. Users can benefit from customized AI models without having …
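One common way to fit fine-tuning into a tiny memory budget is to update only a small subset of tensors. The sketch below freezes everything except bias terms and the classifier head as a simplified stand-in for the paper's sparse-update scheme; the helper name and toy model are illustrative assumptions:

```python
import torch

def enable_sparse_update(model: torch.nn.Module, head_prefix: str):
    """Freeze all parameters except biases and the classifier head. Fewer
    trainable tensors means smaller gradients and optimizer state, which
    is what lets fine-tuning fit in a tight memory budget."""
    for name, p in model.named_parameters():
        p.requires_grad = name.endswith("bias") or name.startswith(head_prefix)
    return [p for p in model.parameters() if p.requires_grad]

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
    torch.nn.Linear(16, 10),               # classifier head ("4" in Sequential)
)
trainable = enable_sparse_update(model, head_prefix="4")
opt = torch.optim.SGD(trainable, lr=1e-2)  # optimizer state only for these
```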