From word embeddings to pre-trained language models: A state-of-the-art walkthrough

M Mars - Applied Sciences, 2022 - mdpi.com
With the recent advances in deep learning, different approaches to improving pre-trained
language models (PLMs) have been proposed. PLMs have advanced state-of-the-art …

A survey on transformer compression

Y Tang, Y Wang, J Guo, Z Tu, K Han, H Hu… - arXiv preprint arXiv …, 2024 - arxiv.org
The Transformer plays a vital role in the realms of natural language processing (NLP) and
computer vision (CV), especially for constructing large language models (LLMs) and large …

KronA: Parameter-efficient tuning with Kronecker adapter

A Edalati, M Tahaei, I Kobyzev, VP Nia, JJ Clark… - arXiv preprint arXiv …, 2022 - arxiv.org
Fine-tuning a Pre-trained Language Model (PLM) on a specific downstream task has been a
well-known paradigm in Natural Language Processing. However, with the ever-growing size …

Beyond efficiency: A systematic survey of resource-efficient large language models

G Bai, Z Chai, C Ling, S Wang, J Lu, N Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated
models like OpenAI's ChatGPT, represents a significant advancement in artificial …

LUT-GEMM: Quantized matrix multiplication based on LUTs for efficient inference in large-scale generative language models

G Park, B Park, M Kim, S Lee, J Kim, B Kwon… - arXiv preprint arXiv …, 2022 - arxiv.org
Recent advances in self-supervised learning and the Transformer architecture have
significantly improved natural language processing (NLP), achieving remarkably low …

Information retrieval meets large language models: A strategic report from Chinese IR community

Q Ai, T Bai, Z Cao, Y Chang, J Chen, Z Chen, Z Cheng… - AI Open, 2023 - Elsevier
The research field of Information Retrieval (IR) has evolved significantly, expanding beyond
traditional search to meet diverse user information needs. Recently, Large Language …

Parameter-efficient model adaptation for vision transformers

X He, C Li, P Zhang, J Yang, XE Wang - Proceedings of the AAAI …, 2023 - ojs.aaai.org
In computer vision, great transfer learning performance has been achieved by adapting large-
scale pretrained vision models (e.g., vision transformers) to downstream tasks. Common …

Compression of generative pre-trained language models via quantization

C Tao, L Hou, W Zhang, L Shang, X Jiang, Q Liu… - arXiv preprint arXiv …, 2022 - arxiv.org
The growing size of generative Pre-trained Language Models (PLMs) has greatly
increased the demand for model compression. Despite various methods to compress BERT …

A survey on model compression and acceleration for pretrained language models

C Xu, J McAuley - Proceedings of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
Despite their state-of-the-art performance on many NLP tasks, the high energy cost and
long inference delay prevent Transformer-based pretrained language models (PLMs) from …

What matters in the structured pruning of generative language models?

M Santacroce, Z Wen, Y Shen, Y Li - arXiv preprint arXiv:2302.03773, 2023 - arxiv.org
Auto-regressive large language models such as GPT-3 require enormous computational
resources to use. Traditionally, structured pruning methods are employed to reduce …