A review of green artificial intelligence: Towards a more sustainable future

V Bolón-Canedo, L Morán-Fernández, B Cancela… - Neurocomputing, 2024 - Elsevier
Green artificial intelligence (AI) is more environmentally friendly and inclusive than
conventional AI, as it not only produces accurate results without increasing the …

A survey of techniques for optimizing transformer inference

KT Chitty-Venkata, S Mittal, M Emani… - Journal of Systems …, 2023 - Elsevier
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …

SmoothQuant: Accurate and efficient post-training quantization for large language models

G Xiao, J Lin, M Seznec, H Wu… - International …, 2023 - proceedings.mlr.press
Large language models (LLMs) show excellent performance but are compute- and memory-
intensive. Quantization can reduce memory and accelerate inference. However, existing …

Rethinking vision transformers for MobileNet size and speed

Y Li, J Hu, Y Wen, G Evangelidis… - Proceedings of the …, 2023 - openaccess.thecvf.com
With the success of Vision Transformers (ViTs) in computer vision tasks, recent works try to
optimize the performance and complexity of ViTs to enable efficient deployment on mobile …

QuIP: 2-bit quantization of large language models with guarantees

J Chee, Y Cai, V Kuleshov… - Advances in Neural …, 2023 - proceedings.neurips.cc
This work studies post-training parameter quantization in large language models (LLMs).
We introduce quantization with incoherence processing (QuIP), a new method based on the …

ZeroQuant: Efficient and affordable post-training quantization for large-scale transformers

Z Yao, R Yazdani Aminabadi… - Advances in …, 2022 - proceedings.neurips.cc
How to efficiently serve ever-larger trained natural language models in practice has become
exceptionally challenging even for powerful cloud servers due to their prohibitive …

TinyViT: Fast pretraining distillation for small vision transformers

K Wu, J Zhang, H Peng, M Liu, B Xiao, J Fu… - European Conference on …, 2022 - Springer
Vision transformer (ViT) has recently drawn great attention in computer vision due to its
remarkable model capability. However, most prevailing ViT models suffer from a huge number …

ShortGPT: Layers in large language models are more redundant than you expect

X Men, M Xu, Q Zhang, B Wang, H Lin, Y Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
As Large Language Models (LLMs) continue to advance in performance, their size has
escalated significantly, with current LLMs containing billions or even trillions of parameters …

Aging with GRACE: Lifelong model editing with discrete key-value adaptors

T Hartvigsen, S Sankaranarayanan… - Advances in …, 2023 - proceedings.neurips.cc
Deployed language models decay over time due to shifting inputs, changing user needs, or
emergent world-knowledge gaps. When such problems are identified, we want to make …

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …