Memory-efficient fine-tuning of compressed large language models via sub-4-bit integer quantization

J Kim, JH Lee, S Kim, J Park, KM Yoo… - Advances in Neural …, 2024 - proceedings.neurips.cc
Large language models (LLMs) face challenges in fine-tuning and deployment due to
their high memory demands and computational costs. While parameter-efficient fine-tuning …

EfficientQAT: Efficient quantization-aware training for large language models

M Chen, W Shao, P Xu, J Wang, P Gao… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are crucial in modern natural language processing and
artificial intelligence. However, they face challenges in managing their significant memory …

Nearest is not dearest: Towards practical defense against quantization-conditioned backdoor attacks

B Li, Y Cai, H Li, F Xue, Z Li… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Model quantization is widely used to compress and accelerate deep neural
networks. However, recent studies have revealed the feasibility of weaponizing model …

Beyond efficiency: A systematic survey of resource-efficient large language models

G Bai, Z Chai, C Ling, S Wang, J Lu, N Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated
models like OpenAI's ChatGPT, represents a significant advancement in artificial …

A survey of resource-efficient LLM and multimodal foundation models

M Xu, W Yin, D Cai, R Yi, D Xu, Q Wang, B Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large foundation models, including large language models (LLMs), vision transformers
(ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine …

A survey of low-bit large language models: Basics, systems, and algorithms

R Gong, Y Ding, Z Wang, C Lv, X Zheng, J Du… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have achieved remarkable advancements in natural
language processing, showcasing exceptional performance across various tasks. However …

Resource-efficient Algorithms and Systems of Foundation Models: A Survey

M Xu, D Cai, W Yin, S Wang, X Jin, X Liu - ACM Computing Surveys, 2025 - dl.acm.org
Large foundation models, including large language models, vision transformers, diffusion,
and large language model based multimodal models, are revolutionizing the entire machine …

PTQ4SAM: Post-Training Quantization for Segment Anything

C Lv, H Chen, J Guo, Y Ding… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Segment Anything Model (SAM) has achieved impressive performance in many
computer vision tasks. However, as a large-scale model, the immense memory and …

Optimize weight rounding via signed gradient descent for the quantization of LLMs

W Cheng, W Zhang, H Shen, Y Cai, X He, K Lv… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have proven their exceptional capabilities in performing
language-related tasks. However, their deployment poses significant challenges due to their …

Resource-efficient algorithms and systems of foundation models: A survey

M Xu, D Cai, W Yin, S Wang, X Jin, X Liu - ACM Comput. Surv., 2024 - xumengwei.github.io
In the rapidly evolving field of artificial intelligence (AI), a paradigm shift is underway. We are
witnessing the transition from specialized, fragmented deep learning models to versatile …