Recent advances in natural language processing via large pre-trained language models: A survey
Large, pre-trained language models (PLMs) such as BERT and GPT have drastically
changed the Natural Language Processing (NLP) field. For numerous NLP tasks …
A survey on deep neural network pruning: Taxonomy, comparison, analysis, and recommendations
Modern deep neural networks, particularly recent large language models, come with
massive model sizes that require significant computational and storage resources. To …
AdaLoRA: Adaptive budget allocation for parameter-efficient fine-tuning
Fine-tuning large pre-trained language models on downstream tasks has become an
important paradigm in NLP. However, common practice fine-tunes all of the parameters in a …
LoSparse: Structured compression of large language models based on low-rank and sparse approximation
Transformer models have achieved remarkable results in various natural language tasks,
but they are often prohibitively large, requiring massive memory and computational …
Revisiting out-of-distribution robustness in NLP: Benchmarks, analysis, and LLMs evaluations
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of
NLP. We find that the distribution shift settings in previous studies commonly lack adequate …
Task-specific skill localization in fine-tuned language models
Pre-trained language models can be fine-tuned to solve diverse NLP tasks, including in few-shot
settings. Thus fine-tuning allows the model to quickly pick up task-specific "skills," but …
PLATON: Pruning large transformer models with upper confidence bound of weight importance
Large Transformer-based models have exhibited superior performance in various natural
language processing and computer vision tasks. However, these models contain enormous …
State-of-the-art generalisation research in NLP: a taxonomy and review
The ability to generalise well is one of the primary desiderata of natural language
processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is …
Edge AI: A taxonomy, systematic review and future directions
Edge Artificial Intelligence (AI) incorporates a network of interconnected systems
and devices that receive, cache, process, and analyse data in close communication with the …
Configurable foundation models: Building LLMs from a modular perspective
Advancements in LLMs have recently unveiled challenges tied to computational efficiency
and continual scalability due to their huge parameter counts, making the …