Faith and fate: Limits of transformers on compositionality

N Dziri, X Lu, M Sclar, XL Li, L Jiang… - Advances in …, 2023 - proceedings.neurips.cc
Transformer large language models (LLMs) have sparked admiration for their exceptional
performance on tasks that demand intricate multi-step reasoning. Yet, these models …

Exploring length generalization in large language models

C Anil, Y Wu, A Andreassen… - Advances in …, 2022 - proceedings.neurips.cc
The ability to extrapolate from short problem instances to longer ones is an important form of
out-of-distribution generalization in reasoning tasks, and is crucial when learning from …

A taxonomy and review of generalization research in NLP

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - Nature Machine …, 2023 - nature.com
The ability to generalize well is one of the primary desiderata for models of natural language
processing (NLP), but what 'good generalization' entails and how it should be evaluated is …

Foundation models for music: A survey

Y Ma, A Øland, A Ragni, BMS Del Sette, C Saitis… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, foundation models (FMs) such as large language models (LLMs) and latent
diffusion models (LDMs) have profoundly impacted diverse sectors, including music. This …

Efficient methods for natural language processing: A survey

M Treviso, JU Lee, T Ji, B Aken, Q Cao… - Transactions of the …, 2023 - direct.mit.edu
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …

Compositionality decomposed: How do neural networks generalise?

D Hupkes, V Dankers, M Mul, E Bruni - Journal of Artificial Intelligence …, 2020 - jair.org
Despite a multitude of empirical studies, little consensus exists on whether neural networks
are able to generalise compositionally, a controversy that, in part, stems from a lack of …

Transformers can achieve length generalization but not robustly

Y Zhou, U Alon, X Chen, X Wang, R Agarwal… - arXiv preprint arXiv …, 2024 - arxiv.org
Length generalization, defined as the ability to extrapolate from shorter training sequences
to longer test ones, is a significant challenge for language models. This issue persists even …

Functional interpolation for relative positions improves long context transformers

S Li, C You, G Guruganesh, J Ainslie… - arXiv preprint arXiv …, 2023 - arxiv.org
Preventing the performance decay of Transformers on inputs longer than those used for
training has been an important challenge in extending the context length of these models …
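The title points at a learned, functional form of relative position bias. Below is a minimal sketch of what such a bias could look like: a small MLP mapping a log-normalized relative distance to one additive attention bias per head. The class name, the log transform, and the normalization by query position are illustrative assumptions drawn from the title and snippet, not confirmed details of the paper's method.

```python
import torch
import torch.nn as nn

class FunctionalRelativeBias(nn.Module):
    """Hypothetical: maps normalized relative distance to an additive
    attention bias via an MLP, one output channel per attention head."""

    def __init__(self, num_heads: int, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, num_heads)
        )

    def forward(self, seq_len: int) -> torch.Tensor:
        q = torch.arange(seq_len).unsqueeze(1)   # query positions i, shape (L, 1)
        k = torch.arange(seq_len).unsqueeze(0)   # key positions j, shape (1, L)
        rel = (q - k).clamp(min=0).float()       # causal relative distance i - j
        # Normalizing by the query position keeps the MLP input in [0, 1]
        # for any sequence length, so longer-than-trained inputs stay
        # in-distribution for the bias function (the "interpolation" idea).
        norm = torch.log1p(rel) / torch.log1p(q.float().clamp(min=1))
        bias = self.mlp(norm.unsqueeze(-1))      # (L, L, num_heads)
        return bias.permute(2, 0, 1)             # (num_heads, L, L), added to logits
```

In use, the returned tensor would be added to the pre-softmax attention logits alongside the usual causal mask; because the bias is a function of a bounded input rather than a lookup table of fixed size, it is defined for arbitrary sequence lengths.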

State-of-the-art generalisation research in NLP: a taxonomy and review

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - arXiv preprint arXiv …, 2022 - arxiv.org
The ability to generalise well is one of the primary desiderata of natural language
processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is …

Length generalization in arithmetic transformers

S Jelassi, S d'Ascoli, C Domingo-Enrich, Y Wu… - arXiv preprint arXiv …, 2023 - arxiv.org
We examine how transformers cope with two challenges: learning basic integer arithmetic,
and generalizing to longer sequences than seen during training. We find that relative …
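The snippet describes the standard length-generalization setup for arithmetic: train on short operands, test on strictly longer ones. A minimal sketch of such a split is below; the digit ranges and dataset sizes are illustrative choices, not the paper's exact configuration.

```python
import random

def make_example(n_digits: int) -> tuple[str, str]:
    """Sample an addition problem whose operands each have n_digits digits."""
    a = random.randint(10 ** (n_digits - 1), 10 ** n_digits - 1)
    b = random.randint(10 ** (n_digits - 1), 10 ** n_digits - 1)
    return f"{a}+{b}=", str(a + b)

# Train on short operands; test on longer, entirely unseen lengths.
train = [make_example(random.randint(1, 5)) for _ in range(10_000)]
test = [make_example(random.randint(6, 10)) for _ in range(1_000)]

# A model that has learned the carrying algorithm, rather than
# length-specific surface patterns, should transfer from train to test.
```

Under this split, test accuracy directly measures extrapolation: no test operand length ever appears during training, so memorizing per-length patterns cannot help.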