Continual learning for large language models: A survey

T Wu, L Luo, YF Li, S Pan, TT Vu, G Haffari - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are not amenable to frequent re-training, due to high
training costs arising from their massive scale. However, updates are necessary to endow …

Recent advances of foundation language models-based continual learning: A survey

Y Yang, J Zhou, X Ding, T Huai, S Liu, Q Chen… - ACM Computing …, 2025 - dl.acm.org
Recently, foundation language models (LMs) have achieved significant success in the
domains of natural language processing and computer vision. Unlike traditional neural …

TIES-Merging: Resolving interference when merging models

P Yadav, D Tam, L Choshen… - Advances in Neural …, 2024 - proceedings.neurips.cc
Transfer learning, i.e., further fine-tuning a pre-trained model on a downstream task, can
confer significant advantages, including improved downstream performance, faster …
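
As a quick illustration of the procedure the TIES-Merging title refers to, the sketch below shows the three commonly described steps: trim each task vector to its largest-magnitude entries, elect a per-parameter sign, and average only the entries that agree with that sign. The function name, arguments, and default hyperparameters here are illustrative assumptions, not the authors' reference implementation.

```python
import torch

def ties_merge(base_state, expert_states, density=0.2, lam=1.0):
    """Merge fine-tuned checkpoints into the base model, TIES-style (sketch)."""
    merged = {}
    for name, base in base_state.items():
        # Task vectors: difference between each expert and the base weights.
        tvs = torch.stack([exp[name] - base for exp in expert_states])
        flat = tvs.view(tvs.shape[0], -1)

        # 1) Trim: keep only the top-`density` fraction of entries by magnitude.
        k = max(1, int(density * flat.shape[1]))
        thresh = flat.abs().kthvalue(flat.shape[1] - k + 1, dim=1, keepdim=True).values
        flat = torch.where(flat.abs() >= thresh, flat, torch.zeros_like(flat))

        # 2) Elect sign: per parameter, keep the sign with the larger total mass.
        elected = torch.sign(flat.sum(dim=0))

        # 3) Disjoint merge: average only entries whose sign matches the elected one.
        agree = torch.sign(flat) == elected
        summed = (flat * agree).sum(dim=0)
        counts = agree.sum(dim=0).clamp(min=1)
        merged_tv = (summed / counts).view_as(base)

        merged[name] = base + lam * merged_tv
    return merged
```

In use, `base_state` and each entry of `expert_states` would be floating-point `state_dict`s of checkpoints sharing the same architecture; the merged dictionary can then be loaded back into the base model.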

Exploring parameter-efficient fine-tuning techniques for code generation with large language models

M Weyssow, X Zhou, K Kim, D Lo… - ACM Transactions on …, 2023 - dl.acm.org
Large language models (LLMs) demonstrate impressive capabilities to generate accurate
code snippets given natural language intents in a zero-shot manner, i.e., without the need for …

A survey of large language models for code: Evolution, benchmarking, and future trends

Z Zheng, K Ning, Y Wang, J Zhang, D Zheng… - arXiv preprint arXiv …, 2023 - arxiv.org
General-purpose large language models (LLMs), exemplified by ChatGPT, have demonstrated
significant potential in tasks such as code generation in software engineering. This has led …

Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models

J Parmar, S Satheesh, M Patwary, M Shoeybi… - arXiv preprint arXiv …, 2024 - arxiv.org
As language models have scaled both their number of parameters and pretraining dataset
sizes, the computational cost for pretraining has become intractable except for the most well …

Continual learning with pre-trained models: A survey

DW Zhou, HL Sun, J Ning, HJ Ye, DC Zhan - arXiv preprint arXiv …, 2024 - arxiv.org
Nowadays, real-world applications often face streaming data, which requires the learning
system to absorb new knowledge as data evolves. Continual Learning (CL) aims to achieve …

Simple and scalable strategies to continually pre-train large language models

A Ibrahim, B Thérien, K Gupta, ML Richter… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) are routinely pre-trained on billions of tokens, only to start
the process over again once new data becomes available. A much more efficient solution is …

When to stop? Towards efficient code generation in LLMs with excess token prevention

L Guo, Y Wang, E Shi, W Zhong, H Zhang… - Proceedings of the 33rd …, 2024 - dl.acm.org
Code generation aims to automatically generate code snippets that meet given natural
language requirements and plays an important role in software development. Although …

What Matters for Model Merging at Scale?

P Yadav, T Vu, J Lai, A Chronopoulou… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging aims to combine multiple expert models into a more capable single model,
offering benefits such as reduced storage and serving costs, improved generalization, and …