The Llama 3 herd of models

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

DeepSeekMath: Pushing the limits of mathematical reasoning in open language models

Z Shao, P Wang, Q Zhu, R Xu, J Song, X Bi… - arXiv preprint arXiv …, 2024 - arxiv.org
Mathematical reasoning poses a significant challenge for language models due to its
complex and structured nature. In this paper, we introduce DeepSeekMath 7B, which …

Toward self-improvement of LLMs via imagination, searching, and criticizing

Y Tian, B Peng, L Song, L Jin, D Yu… - Advances in Neural …, 2025 - proceedings.neurips.cc
Despite the impressive capabilities of Large Language Models (LLMs) on various tasks, they
still struggle with scenarios that involve complex reasoning and planning. Self-correction …

Recursive introspection: Teaching language model agents how to self-improve

Y Qu, T Zhang, N Garg… - Advances in Neural …, 2025 - proceedings.neurips.cc
A central piece in enabling intelligent agentic behavior in foundation models is to make them
capable of introspecting upon their behavior, reasoning, and correcting their mistakes as …

V-STaR: Training verifiers for self-taught reasoners

A Hosseini, X Yuan, N Malkin, A Courville… - arXiv preprint arXiv …, 2024 - arxiv.org
Common self-improvement approaches for large language models (LLMs), such as STaR,
iteratively fine-tune LLMs on self-generated solutions to improve their problem-solving …

InternLM-Math: Open math large language models toward verifiable reasoning

H Ying, S Zhang, L Li, Z Zhou, Y Shao, Z Fei… - arXiv preprint arXiv …, 2024 - arxiv.org
The math abilities of large language models can represent their abstract reasoning ability. In
this paper, we introduce and open-source our math reasoning LLM InternLM-Math, which is …

Chain of preference optimization: Improving chain-of-thought reasoning in LLMs

X Zhang, C Du, T Pang, Q Liu… - Advances in Neural …, 2025 - proceedings.neurips.cc
The recent development of chain-of-thought (CoT) decoding has enabled large language
models (LLMs) to generate explicit logical reasoning paths for complex problem-solving …

Easy-to-hard generalization: Scalable alignment beyond human supervision

Z Sun, L Yu, Y Shen, W Liu, Y Yang, S Welleck… - arXiv preprint arXiv …, 2024 - arxiv.org
Current AI alignment methodologies rely on human-provided demonstrations or judgments,
and the learned capabilities of AI systems would be upper-bounded by human capabilities …

Step-DPO: Step-wise preference optimization for long-chain reasoning of LLMs

X Lai, Z Tian, Y Chen, S Yang, X Peng, J Jia - arXiv preprint arXiv …, 2024 - arxiv.org
Mathematical reasoning presents a significant challenge for Large Language Models
(LLMs) due to the extensive and precise chain of reasoning required for accuracy. Ensuring …

NExT: Teaching large language models to reason about code execution

A Ni, M Allamanis, A Cohan, Y Deng, K Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
A fundamental skill among human developers is the ability to understand and reason about
program execution. As an example, a programmer can mentally simulate code execution in …