Iterative reasoning preference optimization

RY Pang, W Yuan, H He, K Cho… - Advances in …, 2025 - proceedings.neurips.cc
Iterative preference optimization methods have recently been shown to perform well for
general instruction tuning tasks, but typically make little improvement on reasoning tasks. In …

Datasets for large language models: A comprehensive survey

Y Liu, J Cao, C Liu, K Ding, L Jin - arXiv preprint arXiv:2402.18041, 2024 - arxiv.org
This paper embarks on an exploration of Large Language Model (LLM) datasets,
which play a crucial role in the remarkable advancements of LLMs. The datasets serve as …

Recursive introspection: Teaching language model agents how to self-improve

Y Qu, T Zhang, N Garg… - Advances in Neural …, 2025 - proceedings.neurips.cc
A central piece in enabling intelligent agentic behavior in foundation models is to make them
capable of introspecting upon their behavior, reasoning, and correcting their mistakes as …

Step-DPO: Step-wise preference optimization for long-chain reasoning of LLMs

X Lai, Z Tian, Y Chen, S Yang, X Peng, J Jia - arXiv preprint arXiv …, 2024 - arxiv.org
Mathematical reasoning presents a significant challenge for Large Language Models
(LLMs) due to the extensive and precise chain of reasoning required for accuracy. Ensuring …

LLaMA-Berry: Pairwise optimization for o1-like Olympiad-level mathematical reasoning

D Zhang, J Wu, J Lei, T Che, J Li, T Xie… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper presents an advanced mathematical problem-solving framework, LLaMA-Berry,
for enhancing the mathematical reasoning ability of Large Language Models (LLMs). The …

DART-Math: Difficulty-aware rejection tuning for mathematical problem-solving

Y Tong, X Zhang, R Wang, R Wu, J He - arXiv preprint arXiv:2407.13690, 2024 - arxiv.org
Solving mathematical problems requires advanced reasoning abilities and presents notable
challenges for large language models. Previous works usually synthesize data from …

Building math agents with multi-turn iterative preference learning

W Xiong, C Shi, J Shen, A Rosenberg, Z Qin… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent studies have shown that large language models' (LLMs) mathematical problem-
solving capabilities can be enhanced by integrating external tools, such as code …

OpenMathInstruct-2: Accelerating AI for math with massive open-source instruction data

S Toshniwal, W Du, I Moshkov, B Kisacanin… - arXiv preprint arXiv …, 2024 - arxiv.org
Mathematical reasoning continues to be a critical challenge in large language model (LLM)
development, attracting significant interest. However, most of the cutting-edge progress in …

AI-assisted generation of difficult math questions

V Shah, D Yu, K Lyu, S Park, J Yu, Y He, NR Ke… - arXiv preprint arXiv …, 2024 - arxiv.org
Current LLM training positions mathematical reasoning as a core capability. With publicly
available sources fully tapped, there is unmet demand for diverse and challenging math …

BlueLM-V-3B: Algorithm and system co-design for multimodal large language models on mobile devices

X Lu, Y Chen, C Chen, H Tan, B Chen, Y Xie… - arXiv preprint arXiv …, 2024 - arxiv.org
The emergence and growing popularity of multimodal large language models (MLLMs) have
significant potential to enhance various aspects of daily life, from improving communication …