Qwen2.5-Math technical report: Toward mathematical expert model via self-improvement

A Yang, B Zhang, B Hui, B Gao, B Yu, C Li… - arxiv preprint arxiv …, 2024 - arxiv.org
In this report, we present a series of math-specific large language models: Qwen2.5-Math
and Qwen2.5-Math-Instruct-1.5B/7B/72B. The core innovation of the Qwen2.5 series lies in …

Step-dpo: Step-wise preference optimization for long-chain reasoning of llms

X Lai, Z Tian, Y Chen, S Yang, X Peng, J Jia - arxiv preprint arxiv …, 2024 - arxiv.org
Mathematical reasoning presents a significant challenge for Large Language Models
(LLMs) due to the extensive and precise chain of reasoning required for accuracy. Ensuring …
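
For reference, the title points to a DPO-style preference objective applied to individual reasoning steps rather than whole responses. A rough sketch, assuming the standard DPO loss conditioned on the prompt x and a shared prefix of accepted steps s_{<k}, where y_w / y_l are the preferred and dispreferred candidate steps, \pi_{\mathrm{ref}} the reference policy, and \beta the temperature (the step-level conditioning is an assumption, not taken from the snippet above):

\mathcal{L}_{\text{step}} = -\,\mathbb{E}\Big[\log \sigma\Big(\beta \log \frac{\pi_\theta(y_w \mid x, s_{<k})}{\pi_{\mathrm{ref}}(y_w \mid x, s_{<k})} - \beta \log \frac{\pi_\theta(y_l \mid x, s_{<k})}{\pi_{\mathrm{ref}}(y_l \mid x, s_{<k})}\Big)\Big]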

Numinamath: The largest public dataset in ai4maths with 860k pairs of competition math problems and solutions

J Li, E Beeching, L Tunstall, B Lipkin… - Hugging Face …, 2024 - faculty.bicmr.pku.edu.cn
Numina is an open AI4Maths initiative dedicated to advancing both artificial and human
intelligence in the field of mathematics. In this paper, we present the NuminaMath dataset, a …

Rethinking data selection at scale: Random selection is almost all you need

T Xia, B Yu, K Dang, A Yang, Y Wu, Y Tian… - arxiv preprint arxiv …, 2024 - arxiv.org
Supervised fine-tuning (SFT) is crucial for aligning Large Language Models (LLMs) with
human instructions. The primary goal during SFT is to select a small yet representative …
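
As a concrete illustration of the selection setting above, a minimal sketch of uniform random subset selection for SFT data (the function and variable names are hypothetical, not taken from the paper):

import random

def select_sft_subset(pool, k, seed=0):
    # Uniformly sample k instruction-response pairs from a larger SFT pool;
    # a fixed seed keeps the selection reproducible.
    rng = random.Random(seed)
    return rng.sample(pool, k)

# e.g. subset = select_sft_subset(full_pool, k=1000)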

Sc-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners

Y Yan, J Jiang, Y Liu, Y Cao, X Xu, X Cai… - arxiv preprint arxiv …, 2024 - arxiv.org
Self-correction is a novel method that can stimulate the potential reasoning abilities of large
language models (LLMs). It involves detecting and correcting errors during the inference …

ProVision: Programmatically Scaling Vision-centric Instruction Data for Multimodal Language Models

J Zhang, L Xue, L Song, J Wang, W Huang… - arxiv preprint arxiv …, 2024 - arxiv.org
With the rise of multimodal applications, instruction data has become critical for training
multimodal language models capable of understanding complex image-based queries …

Towards Effective and Efficient Continual Pre-training of Large Language Models

J Chen, Z Chen, J Wang, K Zhou, Y Zhu, J Jiang… - arxiv preprint arxiv …, 2024 - arxiv.org
Continual pre-training (CPT) has been an important approach for adapting language models
to specific domains or tasks. To make the CPT approach more traceable, this paper presents …

Technical report: Enhancing llm reasoning with reward-guided tree search

J Jiang, Z Chen, Y Min, J Chen, X Cheng… - arxiv preprint arxiv …, 2024 - arxiv.org
Recently, test-time scaling has garnered significant attention from the research community,
largely due to the substantial advancements of the o1 model released by OpenAI. By …

Mix-cpt: A domain adaptation framework via decoupling knowledge learning and format alignment

J Jiang, J Li, WX Zhao, Y Song, T Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org
Adapting general large language models (LLMs) to specialized domains presents great
challenges due to varied data distributions. This adaptation typically requires continual pre …

Not Everything is All You Need: Toward Low-Redundant Optimization for Large Language Model Alignment

Z Chen, K Zhou, WX Zhao, J Wang… - Proceedings of the 2024 …, 2024 - aclanthology.org
Large language models (LLMs) still struggle to align with human preferences in
complex tasks and scenarios. They are prone to overfitting to unexpected patterns or …