A survey on data selection for language models
A major factor in the recent success of large language models is the use of enormous and
ever-growing text datasets for unsupervised pre-training. However, naively training a model …
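A hedged illustration of what "data selection" can mean in practice: a few heuristic quality filters of the kind such surveys catalogue. All threshold values and the toy corpus below are illustrative assumptions, not figures from the paper.

def keep_document(text: str) -> bool:
    # Length filter: drop very short or extremely long documents.
    words = text.split()
    if not 50 <= len(words) <= 100_000:
        return False
    # Mean word-length filter: screens out gibberish and token noise.
    mean_len = sum(len(w) for w in words) / len(words)
    if not 3 <= mean_len <= 10:
        return False
    # Alphabetic-ratio filter: drops symbol-heavy boilerplate.
    alpha = sum(c.isalpha() for c in text) / max(len(text), 1)
    return alpha >= 0.6

corpus = ["short", "a longer document " * 30]
print([keep_document(doc) for doc in corpus])  # [False, True]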
A survey of text classification with transformers: How wide? How large? How long? How accurate? How expensive? How safe?
Text classification in natural language processing (NLP) is evolving rapidly, particularly with
the surge in transformer-based models, including large language models (LLMs). This paper …
A survey of large language models
Ever since the Turing Test was proposed in the 1950s, humans have explored how machines
can master language intelligence. Language is essentially a complex, intricate system of …
C-Pack: Packed resources for general Chinese embeddings
We introduce C-Pack, a package of resources that significantly advances the field of general
text embeddings for Chinese. C-Pack includes three critical resources. 1) C-MTP is a …
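A minimal usage sketch for the kind of Chinese text-embedding models C-Pack is built around, assuming one of the publicly released BGE checkpoints (the model name below is an assumption about what is available, not something stated in the snippet).

from sentence_transformers import SentenceTransformer, util

# Load an assumed BGE Chinese embedding checkpoint.
model = SentenceTransformer("BAAI/bge-small-zh-v1.5")
sentences = ["今天天气很好", "今天阳光明媚", "股市大幅下跌"]
emb = model.encode(sentences, normalize_embeddings=True)
print(util.cos_sim(emb[0], emb[1]).item())  # near-paraphrase pair: high similarity
print(util.cos_sim(emb[0], emb[2]).item())  # unrelated pair: lower similarity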
Phi-3 technical report: A highly capable language model locally on your phone
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion
tokens, whose overall performance, as measured by both academic benchmarks and …
Textbooks are all you need
We introduce phi-1, a new large language model for code, with significantly smaller size
than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained …
RWKV: Reinventing RNNs for the transformer era
Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …
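A minimal sketch (not RWKV itself) of the scaling contrast the abstract refers to: self-attention materializes an n-by-n score matrix, so memory and compute grow quadratically with sequence length, while an RNN-style update carries one fixed-size state and runs in linear time. All shapes and names are illustrative assumptions.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # The (n, n) score matrix is the quadratic bottleneck.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores) @ V

def rnn_style(x, W):
    # One fixed-size hidden state per step: O(n) time, O(1) state memory.
    h = np.zeros(W.shape[0])
    out = []
    for x_t in x:
        h = np.tanh(W @ h + x_t)
        out.append(h)
    return np.stack(out)

n, d = 512, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
print(attention(Q, K, V).shape)  # (512, 64), computed via a 512x512 score matrix
print(rnn_style(rng.normal(size=(n, d)), rng.normal(size=(d, d))).shape)  # (512, 64)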
Scalable extraction of training data from (production) language models
This paper studies extractable memorization: training data that an adversary can efficiently
extract by querying a machine learning model without prior knowledge of the training …
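A minimal sketch of the general extraction test the abstract describes: feed a model a prefix and check whether greedy decoding reproduces a candidate training string verbatim. The model choice and the candidate string below are placeholder assumptions, not the paper's setup.

from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def greedy_continuation(prefix: str, max_new_tokens: int = 32) -> str:
    ids = tok(prefix, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    return tok.decode(out[0, ids.shape[1]:], skip_special_tokens=True)

candidate = "the quick brown fox jumps over the lazy dog"  # hypothetical training string
prefix, suffix = candidate[:24], candidate[24:]
# If the model emits the exact suffix, the string counts as extractable.
print(greedy_continuation(prefix).lstrip().startswith(suffix.strip()))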
StarCoder 2 and The Stack v2: The next generation
The BigCode project, an open-scientific collaboration focused on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In …
Crosslingual generalization through multitask finetuning
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …
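A toy sketch of the "prompted" part of multitask prompted finetuning: each labelled example is cast into a natural-language instruction/response pair before finetuning. The template wording below is an illustrative assumption, not a template from the paper.

def to_prompted_pair(premise: str, hypothesis: str, label: str) -> dict:
    # Cast an NLI example into a text-to-text training pair.
    prompt = (f"Premise: {premise}\nHypothesis: {hypothesis}\n"
              "Does the premise entail the hypothesis? Yes or no?")
    return {"input": prompt, "target": "Yes" if label == "entailment" else "No"}

print(to_prompted_pair("A dog runs across the yard.", "An animal is moving.", "entailment"))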