Crosslingual generalization through multitask finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …

Pretraining language models with human preferences

T Korbak, K Shi, A Chen, RV Bhalerao… - International …, 2023 - proceedings.mlr.press
Language models (LMs) are pretrained to imitate text from large and diverse datasets that contain content that would violate human preferences if generated by an LM …

Modular deep learning

J Pfeiffer, S Ruder, I Vulić, EM Ponti - arXiv preprint arXiv:2302.11529, 2023 - arxiv.org
Transfer learning has recently become the dominant paradigm of machine learning. Pre-trained models fine-tuned for downstream tasks achieve better performance with fewer …

LoRA learns less and forgets less

D Biderman, J Portes, JJG Ortiz, M Paul… - … on Machine Learning …, 2024 - openreview.net
Low-Rank Adaptation (LoRA) is a widely-used parameter-efficient finetuning method for large language models. LoRA saves memory by training only low-rank perturbations to …
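For readers unfamiliar with the mechanism, the "low-rank perturbations" named in the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration assuming a frozen nn.Linear base weight; the rank r, scaling alpha, and dimensions are illustrative choices, not the paper's settings.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Frozen base weight W plus a trainable low-rank update B @ A."""

        def __init__(self, in_features, out_features, r=8, alpha=16.0):
            super().__init__()
            self.base = nn.Linear(in_features, out_features, bias=False)
            self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
            # Only A and B are trained: r * (in + out) parameters instead of in * out.
            self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: no change at start
            self.scaling = alpha / r

        def forward(self, x):
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling

    layer = LoRALinear(768, 768)
    print(layer(torch.randn(2, 768)).shape)  # torch.Size([2, 768])

Because only A and B receive gradients, optimizer state is kept for r * (in + out) parameters rather than the full in * out weight matrix, which is where the memory saving comes from.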

Conditional adapters: Parameter-efficient transfer learning with fast inference

T Lei, J Bai, S Brahma, J Ainslie… - Advances in …, 2023 - proceedings.neurips.cc
We propose Conditional Adapter (CoDA), a parameter-efficient transfer learning method that also improves inference efficiency. CoDA generalizes beyond standard adapter …
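The "standard adapter" that CoDA generalizes is a small bottleneck module trained around a frozen pretrained model; a minimal PyTorch sketch follows. The bottleneck width and activation are illustrative assumptions, and CoDA's own conditional computation (routing only some tokens through the heavy frozen layers) is not shown here.

    import torch
    import torch.nn as nn

    class Adapter(nn.Module):
        """Bottleneck adapter: down-project, nonlinearity, up-project, residual add."""

        def __init__(self, d_model=768, bottleneck=64):
            super().__init__()
            self.down = nn.Linear(d_model, bottleneck)
            self.up = nn.Linear(bottleneck, d_model)
            self.act = nn.GELU()

        def forward(self, hidden):
            # The residual keeps the pretrained representation intact;
            # only the small down/up projections are trained.
            return hidden + self.up(self.act(self.down(hidden)))

    adapter = Adapter()
    print(adapter(torch.randn(2, 10, 768)).shape)  # torch.Size([2, 10, 768])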

Multilingual large language model: A survey of resources, taxonomy and frontiers

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - arXiv preprint arXiv …, 2024 - arxiv.org
Multilingual Large Language Models use the power of Large Language Models to handle and respond to queries in multiple languages, achieving remarkable …

SLM: Bridge the thin gap between speech and text foundation models

M Wang, W Han, I Shafran, Z Wu… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
We present a joint Speech and Language Model (SLM), a multitask, multilingual, and dual-modal model that takes advantage of pretrained foundational speech and language models …

Understanding and mitigating language confusion in LLMs

K Marchisio, WY Ko, A Bérard, T Dehaze… - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate a surprising limitation of LLMs: their inability to consistently generate text in a
user's desired language. We create the Language Confusion Benchmark (LCB) to evaluate …

PrivacyMind: large language models can be contextual privacy protection learners

Y Xiao, Y Jin, Y Bai, Y Wu, X Yang, X Luo, W Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
The proliferation of Large Language Models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized language models. Nevertheless …

QAmeleon: Multilingual QA with Only 5 Examples

P Agrawal, C Alberti, F Huot, J Maynez, J Ma… - Transactions of the …, 2023 - direct.mit.edu
The availability of large, high-quality datasets has been a major driver of recent progress in
question answering (QA). Such annotated datasets, however, are difficult and costly to …