Large language models are zero-shot time series forecasters
By encoding time series as a string of numerical digits, we can frame time series forecasting
as next-token prediction in text. Developing this approach, we find that large language …
Faith and fate: Limits of transformers on compositionality
Transformer large language models (LLMs) have sparked admiration for their exceptional
performance on tasks that demand intricate multi-step reasoning. Yet, these models …
Weak-to-strong generalization: Eliciting strong capabilities with weak supervision
Widely used alignment techniques, such as reinforcement learning from human feedback
(RLHF), rely on the ability of humans to supervise model behavior, for example, to evaluate …
What can transformers learn in-context? a case study of simple function classes
In-context learning is the ability of a model to condition on a prompt sequence consisting of
in-context examples (input-output pairs corresponding to some task) along with a new query …
Least-to-most prompting enables complex reasoning in large language models
Chain-of-thought prompting has demonstrated remarkable performance on various natural
language reasoning tasks. However, it tends to perform poorly on tasks that require …
Exploring length generalization in large language models
The ability to extrapolate from short problem instances to longer ones is an important form of
out-of-distribution generalization in reasoning tasks, and is crucial when learning from …
Transformers learn shortcuts to automata
Algorithmic reasoning requires capabilities which are most naturally understood through
recurrent models of computation, like the Turing machine. However, Transformer models …
Combinatorial optimization and reasoning with graph neural networks
Combinatorial optimization is a well-established area in operations research and computer
science. Until recently, its methods have focused on solving problem instances in isolation …
Easy-to-hard generalization: Scalable alignment beyond human supervision
Current AI alignment methodologies rely on human-provided demonstrations or judgments,
and the learned capabilities of AI systems would be upper-bounded by human capabilities …
Transformers can achieve length generalization but not robustly
Length generalization, defined as the ability to extrapolate from shorter training sequences
to longer test ones, is a significant challenge for language models. This issue persists even …