DeepSeekMath: Pushing the limits of mathematical reasoning in open language models

Z Shao, P Wang, Q Zhu, R Xu, J Song, X Bi… - arXiv preprint arXiv …, 2024 - arxiv.org
Mathematical reasoning poses a significant challenge for language models due to its
complex and structured nature. In this paper, we introduce DeepSeekMath 7B, which …

Math-Shepherd: Verify and reinforce LLMs step-by-step without human annotations

P Wang, L Li, Z Shao, R Xu, D Dai, Y Li… - Proceedings of the …, 2024 - aclanthology.org
In this paper, we present an innovative process-oriented math reward model called
Math-Shepherd, which assigns a reward score to each step of math problem solutions. The …
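
The step-level scoring the snippet describes can be sketched as follows. This is an illustrative outline only, not the paper's code: score_step is a hypothetical stand-in for the trained process reward model, and the heuristic inside it is a placeholder.

```python
# Hedged sketch of a process reward model (PRM) assigning one score per
# solution step, as the Math-Shepherd snippet describes. The scorer is a
# stand-in; the paper trains a model on automatically constructed step
# labels, which is not reproduced here.

def score_step(question: str, steps_so_far: list[str]) -> float:
    """Hypothetical step scorer; a real PRM would be a trained model."""
    # Placeholder heuristic: favor steps containing concrete arithmetic.
    last = steps_so_far[-1]
    return 1.0 if any(ch.isdigit() for ch in last) else 0.5

def score_solution(question: str, steps: list[str]) -> list[float]:
    """Score every prefix of the solution: one reward per step."""
    return [score_step(question, steps[: i + 1]) for i in range(len(steps))]

if __name__ == "__main__":
    q = "Natalia sold 48 clips in April and half as many in May. Total?"
    sol = ["In May she sold 48 / 2 = 24 clips.",
           "In total she sold 48 + 24 = 72 clips."]
    print(score_solution(q, sol))  # one reward score per step
```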

MetaMath: Bootstrap your own mathematical questions for large language models

L Yu, W Jiang, H Shi, J Yu, Z Liu, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …

Statistical rejection sampling improves preference optimization

T Liu, Y Zhao, R Joshi, M Khalman, M Saleh… - arXiv preprint arXiv …, 2023 - arxiv.org
Improving the alignment of language models with human preferences remains an active
research challenge. Previous approaches have primarily utilized Reinforcement Learning …
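
The mechanism named in the title can be illustrated with a minimal sketch: draw candidate responses from an SFT policy and accept each with probability proportional to exp(r / beta), so accepted samples approximate a reward-tilted policy. This is an assumption-laden illustration, not the paper's implementation; sample_from_sft and reward below are hypothetical stand-ins.

```python
import math
import random

# Hedged sketch of statistical rejection sampling for preference data:
# candidates come from an SFT policy and are accepted with probability
# exp((r - r_max) / beta), approximating samples from the reward-tilted
# optimal policy. Both callables are toy stand-ins.

def rejection_sample(prompt, sample_from_sft, reward, beta=0.5, n=64):
    candidates = [sample_from_sft(prompt) for _ in range(n)]
    rewards = [reward(prompt, y) for y in candidates]
    r_max = max(rewards)  # normalizer so acceptance probabilities are <= 1
    return [y for y, r in zip(candidates, rewards)
            if random.random() < math.exp((r - r_max) / beta)]

if __name__ == "__main__":
    random.seed(0)
    toy_policy = lambda p: random.choice(["short answer",
                                          "a longer, detailed answer"])
    toy_reward = lambda p, y: float(len(y))  # toy reward: prefers longer text
    print(rejection_sample("Explain RLHF.", toy_policy, toy_reward))
```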

Mathematical language models: A survey

W Liu, H Hu, J Zhou, Y Ding, J Li, J Zeng, M He… - arXiv preprint arXiv …, 2023 - arxiv.org
In recent years, there has been remarkable progress in leveraging Language Models (LMs),
encompassing Pre-trained Language Models (PLMs) and Large-scale Language Models …

Open-Ethical AI: Advancements in Open-Source Human-Centric Neural Language Models

S Sicari, JF Cevallos M, A Rizzardi… - ACM Computing …, 2024 - dl.acm.org
This survey summarises the most recent methods for building and assessing helpful, honest,
and harmless neural language models, considering small, medium, and large-size models …

Amortizing intractable inference in large language models

EJ Hu, M Jain, E Elmoznino, Y Kaddar, G Lajoie… - arXiv preprint arXiv …, 2023 - arxiv.org
Autoregressive large language models (LLMs) compress knowledge from their training data
through next-token conditional distributions. This limits tractable querying of this knowledge …
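
The snippet's claim can be restated in one line: an autoregressive LM exposes the joint distribution only through next-token conditionals, so any query other than left-to-right sampling requires an intractable sum. The notation below is generic, not the paper's.

```latex
% Autoregressive factorization: the model only exposes next-token conditionals.
p_\theta(x_{1:T}) = \prod_{t=1}^{T} p_\theta(x_t \mid x_{<t})
% A posterior query, e.g. a latent span z given observed context x, needs a
% marginalization over all alternatives, which is intractable to compute exactly:
p_\theta(z \mid x) = \frac{p_\theta(x, z)}{\sum_{z'} p_\theta(x, z')}
```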

When Do Program-of-Thoughts Work for Reasoning?

Z Bi, N Zhang, Y Jiang, S Deng, G Zheng… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
The reasoning capabilities of large language models (LLMs) play a pivotal role in the realm
of embodied artificial intelligence. Although there are effective methods like program-of …
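
Program-of-thought, the technique the snippet mentions, has the model emit executable code rather than a free-text reasoning chain, with the final answer coming from running that code. A minimal hedged illustration follows; the "generated" program is a hard-coded stand-in for actual model output.

```python
# Hedged illustration of the program-of-thought pattern: the LLM writes a
# small program for a word problem, and the answer is obtained by
# executing it instead of by free-text reasoning. GENERATED_PROGRAM is a
# hard-coded stand-in for real model output.

GENERATED_PROGRAM = """
clips_april = 48
clips_may = clips_april // 2
answer = clips_april + clips_may
"""

def run_program_of_thought(program: str) -> object:
    """Execute model-written code in a scratch namespace and read 'answer'."""
    namespace: dict = {}
    exec(program, namespace)  # in practice this needs sandboxing
    return namespace["answer"]

if __name__ == "__main__":
    print(run_program_of_thought(GENERATED_PROGRAM))  # -> 72
```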

Knowledgeable preference alignment for LLMs in domain-specific question answering

Y Zhang, Z Chen, Y Fang, Y Lu, F Li, W Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Deploying large language models (LLMs) to real scenarios for domain-specific question
answering (QA) is a key thrust for LLM applications, which poses numerous challenges …

MMICL: Empowering vision-language model with multi-modal in-context learning

H Zhao, Z Cai, S Si, X Ma, K An, L Chen, Z Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Since the resurgence of deep learning, vision-language models (VLMs) built on
large language models (LLMs) have never been more popular. However, while LLMs can …