LISA: layerwise importance sampling for memory-efficient large language model fine-tuning

R Pan, X Liu, S Diao, R Pi, J Zhang… - Advances in Neural …, 2025 - proceedings.neurips.cc
The machine learning community has witnessed impressive advancements since large
language models (LLMs) first appeared. Yet, their massive memory consumption has …
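
The core mechanism is simple enough to illustrate. Below is a minimal sketch of layerwise importance sampling as the abstract describes it, not the authors' released code: each optimization step unfreezes only a small random subset of transformer blocks, so gradients and optimizer state exist for a fraction of the network at a time. The function name, `n_active`, and uniform per-step sampling are illustrative choices, and embedding/head handling is omitted.

```python
import random

def lisa_step(layers, batch_loss_fn, optimizer, n_active=2):
    """One LISA-style step: unfreeze a random subset of blocks, then step.

    `layers` is any indexable container of transformer blocks; `n_active`
    and the uniform sampling schedule are illustrative, not the paper's
    exact hyperparameters.
    """
    active = set(random.sample(range(len(layers)), n_active))
    for i, layer in enumerate(layers):
        for p in layer.parameters():
            p.requires_grad_(i in active)
    optimizer.zero_grad(set_to_none=True)  # drop stale grads of frozen blocks
    loss = batch_loss_fn()                 # forward pass on the current batch
    loss.backward()                        # grads exist only for active blocks
    optimizer.step()
    return loss.item(), active
```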

Large language model inference acceleration: A comprehensive hardware perspective

J Li, J Xu, S Huang, Y Chen, W Li, J Liu, Y Lian… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities across various
fields, from natural language understanding to text generation. Compared to non-generative …

Discovering sparsity allocation for layer-wise pruning of large language models

L Li, P Dong, Z Tang, X Liu, Q Wang… - Advances in …, 2025 - proceedings.neurips.cc
In this paper, we present DSA, the first automated framework for discovering sparsity
allocation schemes for layer-wise pruning in Large Language Models (LLMs). LLMs have …
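
To make "sparsity allocation" concrete, here is a hedged sketch: given per-layer importance scores, distribute a global sparsity budget so less important layers are pruned harder, then apply magnitude pruning per layer. The softmax allocation rule and the `magnitude_prune_` helper are placeholders, not the automated search DSA actually performs.

```python
import torch
import torch.nn as nn

def allocate_sparsity(importances, global_sparsity):
    """Map per-layer importance scores to per-layer sparsity ratios.

    Less important layers get pruned more. This softmax-based rule is an
    illustrative allocation scheme, not DSA's discovered one.
    """
    imp = torch.tensor(importances, dtype=torch.float)
    weights = torch.softmax(-imp, dim=0)        # low importance -> high weight
    raw = weights * global_sparsity * len(imp)  # spread the global budget
    return raw.clamp(max=0.95).tolist()

def magnitude_prune_(linear: nn.Linear, sparsity: float):
    """Zero out the smallest-magnitude weights of one layer, in place."""
    w = linear.weight.data
    k = int(w.numel() * sparsity)
    if k == 0:
        return
    thresh = w.abs().flatten().kthvalue(k).values
    w.mul_((w.abs() > thresh).to(w.dtype))
```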

Defending large language models against jailbreak attacks via layer-specific editing

W Zhao, Z Li, Y Li, Y Zhang, J Sun - arXiv preprint arXiv:2405.18166, 2024 - arxiv.org
Large language models (LLMs) are increasingly being adopted in a wide range of real-
world applications. Despite their impressive performance, recent studies have shown that …
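
The snippet gives only the motivation; as an illustration of the layer-specific angle, a defense in this family edits (here, fine-tunes) only a handful of selected decoder layers while the rest stay frozen. The `layer_prefixes` naming below is hypothetical, and locating which layers matter is the paper's actual contribution.

```python
import torch.nn as nn

def freeze_except_layers(model: nn.Module, layer_prefixes):
    """Keep only the chosen layers trainable for a safety-editing pass.

    `layer_prefixes` (e.g. ["model.layers.12.", "model.layers.13."]) are
    hypothetical indices; identifying the safety-critical layers is the
    hard part that layer-specific editing methods address.
    """
    trainable = 0
    for name, p in model.named_parameters():
        keep = any(name.startswith(pref) for pref in layer_prefixes)
        p.requires_grad_(keep)
        trainable += p.numel() * keep
    return trainable  # parameter count actually being edited
```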

Mobile-Bench: An evaluation benchmark for LLM-based mobile agents

S Deng, W Xu, H Sun, W Liu, T Tan, J Liu, A Li… - arXiv preprint arXiv …, 2024 - arxiv.org
With the remarkable advancements of large language models (LLMs), LLM-based agents
have become a research hotspot in human-computer interaction. However, there is a …

D-LLM: A token adaptive computing resource allocation strategy for large language models

Y Jiang, H Wang, L Xie, H Zhao… - Advances in Neural …, 2025 - proceedings.neurips.cc
Large language models have shown an impressive societal impact owing to their excellent
understanding and logical reasoning skills. However, such strong ability relies on a huge …
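
A hedged sketch of token-adaptive computation: wrap each transformer block with a tiny per-token router so that low-scoring tokens bypass the block through the residual path. The linear-plus-sigmoid router, hard threshold, and dense fallback below are illustrative stand-ins for whatever policy the paper learns.

```python
import torch
import torch.nn as nn

class TokenAdaptiveBlock(nn.Module):
    """Wrap a transformer block with a per-token execute/skip gate.

    The linear+sigmoid router and hard 0.5 threshold are illustrative,
    not D-LLM's actual mechanism; `block` is assumed to map
    (batch, seq, d_model) tensors to the same shape.
    """
    def __init__(self, block: nn.Module, d_model: int):
        super().__init__()
        self.block = block
        self.router = nn.Linear(d_model, 1)

    def forward(self, x):
        gate = torch.sigmoid(self.router(x))  # (batch, seq, 1) execution score
        run = gate > 0.5                       # tokens that pay for compute
        if not run.any():
            return x                           # everyone rides the residual
        # Dense call for clarity; a real system would gather only the
        # selected tokens to actually save FLOPs.
        return torch.where(run, self.block(x), x)
```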

Parenting: Optimizing knowledge selection of retrieval-augmented language models with parameter decoupling and tailored tuning

Y Xu, R Zhang, X Jiang, Y Feng, Y Xiao, X Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
Retrieval-Augmented Generation (RAG) offers an effective solution to the issues faced by
Large Language Models (LLMs) in hallucination generation and knowledge obsolescence …
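
As a loose illustration of parameter decoupling, heavily simplified relative to the paper: score each parameter tensor's importance for two objectives, e.g. adherence to retrieved context versus robustness to noisy context, and assign it to the objective it serves more, so each subset can then receive tailored tuning. The squared-gradient scoring and tensor-level granularity are assumptions.

```python
import torch

def importance_scores(model, loss):
    """Squared-gradient importance per parameter tensor for one objective."""
    model.zero_grad(set_to_none=True)
    loss.backward(retain_graph=True)
    return {n: p.grad.pow(2).sum().item()
            for n, p in model.named_parameters() if p.grad is not None}

def decouple(model, adherence_loss, robustness_loss):
    """Assign each parameter tensor to the objective it matters more for.

    Tensor-level assignment and squared-gradient scoring are illustrative
    simplifications of the paper's decoupling procedure.
    """
    s_a = importance_scores(model, adherence_loss)
    s_r = importance_scores(model, robustness_loss)
    return {n: ("adherence" if s_a[n] >= s_r.get(n, 0.0) else "robustness")
            for n in s_a}
```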

Structured Optimal Brain Pruning for Large Language Models

J Wei, Q Lu, N Jiang, S Li, J Xiang… - Proceedings of the …, 2024 - aclanthology.org
The massive parameters and computational demands hinder the widespread application of
Large Language Models (LLMs). Network pruning provides a practical solution to this …
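
To ground the "optimal brain" framing: classical OBD/OBS-style criteria score a weight by the loss increase its removal causes, roughly 0.5 * H_ii * w_i^2 under a diagonal Hessian. The sketch below applies that at the granularity of whole output channels, using squared gradients as a diagonal Fisher stand-in for the Hessian; it is an illustrative simplification, not the paper's procedure.

```python
import torch
import torch.nn as nn

def channel_saliency(linear: nn.Linear):
    """Optimal-brain-style saliency per output channel.

    Uses squared gradients as a diagonal Fisher approximation of the
    Hessian; run a backward pass on calibration data first so
    `weight.grad` is populated. This diagonal shortcut is an assumption.
    """
    w, g = linear.weight.data, linear.weight.grad
    assert g is not None, "backprop a calibration loss before scoring"
    return 0.5 * (w.pow(2) * g.pow(2)).sum(dim=1)  # one score per channel

def prune_channels_(linear: nn.Linear, keep_ratio: float):
    """Zero whole output channels with the lowest saliency (structured)."""
    scores = channel_saliency(linear)
    k = int(len(scores) * (1 - keep_ratio))
    if k > 0:
        drop = scores.topk(k, largest=False).indices
        linear.weight.data[drop] = 0
        if linear.bias is not None:
            linear.bias.data[drop] = 0
```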

Watermarking Large Language Models and the Generated Content: Opportunities and Challenges

R Zhang, F Koushanfar - arXiv preprint arXiv:2410.19096, 2024 - arxiv.org
The widely adopted and powerful generative large language models (LLMs) have raised
concerns about intellectual property rights violations and the spread of machine-generated …
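
One concrete scheme that surveys like this cover is the statistical "green list" watermark of Kirchenbauer et al. (2023): the previous token seeds a PRNG that partitions the vocabulary, and a small bias toward the green half leaves a trace a detector can recover by counting green tokens and running a z-test. The sketch below shows the generation-side bias; `gamma` and `delta` are typical illustrative defaults.

```python
import torch

def greenlist_bias(logits, prev_token, vocab_size, gamma=0.5, delta=2.0):
    """Bias a pseudorandom 'green list' of the vocabulary before sampling.

    `logits` is a 1-D tensor of size vocab_size for the next token. The
    previous token seeds the partition, so a detector with the same seed
    rule can re-derive the green list and test for over-representation.
    """
    gen = torch.Generator().manual_seed(int(prev_token))
    perm = torch.randperm(vocab_size, generator=gen)
    green = perm[: int(gamma * vocab_size)]  # gamma fraction is "green"
    biased = logits.clone()
    biased[green] += delta                   # nudge sampling toward green
    return biased
```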

Multimodal contrastive in-context learning

Y Miyanishi, ML Nguyen - arXiv preprint arXiv:2408.12959, 2024 - arxiv.org
The rapid growth of Large Language Models (LLMs) usage has highlighted the importance
of gradient-free in-context learning (ICL). However, interpreting their inner workings remains …
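
As a generic illustration of combining contrast with in-context learning, not necessarily this paper's construction: compare the model's next-token logits with and without the demonstrations in context, and amplify the difference the demonstrations induced. The `alpha` strength and the callable `model` interface are assumptions.

```python
import torch

def contrastive_icl_logits(model, with_demos_ids, without_demos_ids, alpha=1.0):
    """Contrast next-token logits with vs. without in-context demos.

    `model` is assumed to be any callable returning (batch, seq, vocab)
    logits, e.g. a causal LM forward pass; amplifying what the
    demonstrations changed highlights their contribution to the output.
    """
    with torch.no_grad():
        l_with = model(with_demos_ids)[:, -1, :]   # logits given demos
        l_wo = model(without_demos_ids)[:, -1, :]  # logits on query alone
    return l_with + alpha * (l_with - l_wo)
```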