Google Scholar

Relating the Seemingly Unrelated: Principled Understanding of Generalization for Generative Models in Arithmetic Reasoning Tasks

X Xu, Z Zhao, H Zhang, Y Yang - arXiv preprint arXiv:2407.17963, 2024 - arxiv.org

Large language models (LLMs) have demonstrated impressive versatility across numerous
tasks, yet their generalization capabilities remain poorly understood. To investigate these …


Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization

R Wang, W Huang, S Song, H Zhang… - arXiv preprint arXiv …, 2025 - arxiv.org

Generalization to novel compound tasks under distribution shift is important for deploying
transformer-based language models (LMs). This work investigates Chain-of-Thought (CoT) …


It ain't that bad: understanding the mysterious performance drop in OOD generalization for...
