A survey on LoRA of large language models

Y Mao, Y Ge, Y Fan, W Xu, Y Mi, Z Hu… - Frontiers of Computer …, 2025 - Springer
Abstract Low-Rank Adaptation (LoRA), which updates the dense neural network layers with
pluggable low-rank matrices, is one of the best-performing parameter-efficient fine-tuning …
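
A minimal sketch of the mechanism the snippet describes: a dense layer is kept frozen and a pluggable pair of low-rank factors B and A is trained, so the effective weight becomes W + (alpha/r)·BA. The rank, scaling, and layer size below are illustrative choices, not values from the survey.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen dense layer plus a pluggable low-rank update: y = x W^T + (alpha/r) x (BA)^T."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                               # dense weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(768, 768))
print(layer(torch.randn(2, 768)).shape)                           # torch.Size([2, 768])
```

Only A and B receive gradients, which is what makes the adapter "pluggable": it can be stored, swapped, or merged into the base weight after training.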

Holmes ⌕ A Benchmark to Assess the Linguistic Competence of Language Models

A Waldis, Y Perlitz, L Choshen, Y Hou… - Transactions of the …, 2024 - direct.mit.edu
We introduce Holmes, a new benchmark designed to assess language models' (LMs')
linguistic competence—their unconscious understanding of linguistic phenomena …

MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router

Y **e, Z Zhang, D Zhou, C **e, Z Song, X Liu… - arxiv preprint arxiv …, 2024 - arxiv.org
Mixture-of-Experts (MoE) architectures face challenges such as high memory consumption
and redundancy in experts. Pruning MoE can reduce network weights while maintaining …
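
One way to read "the hints from its router" is to rank experts by the probability mass the router assigns them on calibration tokens and drop the least-used ones. The toy sketch below shows only that reading; the expert-level criterion, shapes, and calibration setup are assumptions, not MoE-Pruner's algorithm.

```python
import torch

# Toy router-guided pruning: 8 experts, router logits from a small calibration batch.
# Assumption (not the paper's method): experts with the lowest average router
# probability are removed entirely.
torch.manual_seed(0)
num_experts, keep = 8, 4
router_logits = torch.randn(1024, num_experts)        # [calibration tokens, experts]
gate_probs = torch.softmax(router_logits, dim=-1)
expert_importance = gate_probs.mean(dim=0)             # average routing weight per expert

kept = torch.topk(expert_importance, k=keep).indices.sort().values
print("keeping experts:", kept.tolist())

expert_weights = torch.randn(num_experts, 4096, 1024)  # one FFN weight matrix per expert
pruned_weights = expert_weights[kept]                  # drop the low-traffic experts
print("pruned tensor shape:", tuple(pruned_weights.shape))
```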

Compress then serve: Serving thousands of LoRA adapters with little overhead

R Brüel-Gabrielsson, J Zhu, O Bhardwaj… - arxiv preprint arxiv …, 2024 - arxiv.org
Fine-tuning large language models (LLMs) with low-rank adaptations (LoRAs) has become
common practice, often yielding numerous copies of the same LLM differing only in their …
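
The serving pattern behind such deployments is a single frozen base weight shared by everyone, with each request selecting only a small (B, A) pair. A rough sketch with made-up adapter names and sizes, not the paper's serving system or its compression step:

```python
import torch

d, r = 512, 8
base_weight = torch.randn(d, d)                        # shared by all adapters

# Each adapter adds only r * 2d parameters on top of the shared base.
adapters = {
    "adapter_a": (torch.zeros(d, r), torch.randn(r, d) * 0.01),
    "adapter_b": (torch.zeros(d, r), torch.randn(r, d) * 0.01),
}

def serve(x: torch.Tensor, adapter_id: str) -> torch.Tensor:
    B, A = adapters[adapter_id]                        # pick the per-request low-rank pair
    return x @ base_weight.T + x @ (B @ A).T

print(serve(torch.randn(1, d), "adapter_a").shape)     # torch.Size([1, 512])
```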

Asymmetry in low-rank adapters of foundation models

J Zhu, K Greenewald, K Nadjahi, HSO Borde… - arxiv preprint arxiv …, 2024 - arxiv.org
Parameter-efficient fine-tuning optimizes large, pre-trained foundation models by updating a
subset of parameters; in this class, Low-Rank Adaptation (LoRA) is particularly effective …

Exploring Quantization Techniques for Large-Scale Language Models: Methods, Challenges and Future Directions

A Shen, Z Lai, D Li - Proceedings of the 2024 9th International …, 2024 - dl.acm.org
Breakthroughs in natural language processing (NLP) by large-scale language models
(LLMs) have led to superior performance in multilingual tasks such as translation …
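
For orientation on the kind of techniques such a survey covers, the simplest form is symmetric per-tensor int8 post-training quantization of a weight matrix; the sketch below is that generic baseline, not any specific method discussed in the paper.

```python
import torch

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor quantization: one float scale, int8 values."""
    scale = w.abs().max() / 127.0
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(1024, 1024)
q, scale = quantize_int8(w)
print("storage: 4 bytes -> 1 byte per weight")
print("max abs error:", (w - dequantize(q, scale)).abs().max().item())
```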

Federated LoRA with Sparse Communication

K Kuo, A Raje, K Rajesh, V Smith - arxiv preprint arxiv:2406.05233, 2024 - arxiv.org
Low-rank adaptation (LoRA) is a natural method for finetuning in communication-
constrained machine learning settings such as cross-device federated learning. Prior work …
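
The communication-constrained setting can be pictured as each client fine-tuning a small LoRA factor locally and sending only a sparsified version of it for the server to average. The sketch below is a toy illustration with an assumed top-k sparsity rule and shapes, not the protocol studied in the paper.

```python
import torch

d, r, k = 256, 4, 200

def sparsify(update: torch.Tensor, k: int) -> torch.Tensor:
    """Keep only the k largest-magnitude entries; everything else is not transmitted."""
    flat = update.flatten()
    idx = flat.abs().topk(k).indices
    out = torch.zeros_like(flat)
    out[idx] = flat[idx]
    return out.view_as(update)

# Stand-ins for each client's locally trained LoRA delta.
client_updates = [torch.randn(d, r) * 0.01 for _ in range(3)]
sent = [sparsify(u, k) for u in client_updates]        # what actually crosses the network
server_avg = torch.stack(sent).mean(dim=0)
print("nonzeros sent per client:", int((sent[0] != 0).sum()), "of", d * r)
```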

Lossless and Near-Lossless Compression for Foundation Models

M Hershcovitch, L Choshen, A Wood, I Enmouri… - arxiv preprint arxiv …, 2024 - arxiv.org
With the growth of model sizes and the scale of their deployment, their sheer size burdens the
infrastructure, requiring more network bandwidth and more storage to accommodate them. While there …
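
The underlying observation is that checkpoints are just bytes, so a lossless codec can shrink them on disk and restore them bit-for-bit. The sketch below uses zlib as a generic stand-in; it says nothing about the codecs or float-aware byte layouts the paper actually evaluates.

```python
import zlib
import numpy as np

weights = np.random.randn(1_000_000).astype(np.float32)
raw = weights.tobytes()
compressed = zlib.compress(raw)

# Lossless: the decompressed bytes reproduce the original tensor exactly.
restored = np.frombuffer(zlib.decompress(compressed), dtype=np.float32)
assert np.array_equal(weights, restored)

# Random floats barely compress; trained weight tensors expose more byte-level
# redundancy (e.g., clustered exponents), which specialized codecs exploit.
print(f"compressed/raw ratio: {len(compressed) / len(raw):.3f}")
```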

Unforgettable Generalization in Language Models

E Zhang, L Choshen, J Andreas - First Conference on Language …, 2024 - openreview.net
When language models (LMs) are trained to "unlearn" a skill, does this unlearning
generalize? We study the behavior of LMs after being fine-tuned on data for a target task (e.g., …

Towards maintainable machine learning development through continual and modular learning

O Ostapenko - 2024 - papyrus.bib.umontreal.ca
As machine learning models grow in size and complexity, their maintainability becomes a
critical concern, especially when they are increasingly deployed in dynamic, real-world …