Formal mathematical reasoning: A new frontier in AI

K Yang, G Poesia, J He, W Li, K Lauter… - arxiv preprint arxiv …, 2024 - arxiv.org
AI for Mathematics (AI4Math) is not only intriguing intellectually but also crucial for AI-driven
discovery in science, engineering, and beyond. Extensive efforts on AI4Math have mirrored …

Data for mathematical copilots: Better ways of presenting proofs for machine learning

S Frieder, J Bayer, KM Collins, J Berner… - arxiv preprint arxiv …, 2024 - arxiv.org
The suite of datasets commonly used to train and evaluate the mathematical capabilities of
AI-based mathematical copilots (primarily large language models) exhibit several …

Technical report: Enhancing LLM reasoning with reward-guided tree search

J Jiang, Z Chen, Y Min, J Chen, X Cheng… - arxiv preprint arxiv …, 2024 - arxiv.org
Recently, test-time scaling has garnered significant attention from the research community,
largely due to the substantial advancements of the o1 model released by OpenAI. By …

Reasoning Language Models: A Blueprint

M Besta, J Barth, E Schreiber, A Kubicek… - arxiv preprint arxiv …, 2025 - arxiv.org
Reasoning language models (RLMs), also known as Large Reasoning Models (LRMs),
such as OpenAI's o1 and o3, DeepSeek-V3, and Alibaba's QwQ, have redefined AI's …

A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics

TR Wei, H Liu, X Wu, Y Fang - arxiv preprint arxiv:2502.14333, 2025 - arxiv.org
Recent progress in large language models (LLM) found chain-of-thought prompting
strategies to improve the reasoning ability of LLMs by encouraging problem solving through …

PDE-Controller: LLMs for Autoformalization and Reasoning of PDEs

M Soroco, J Song, M Xia, K Emond, W Sun… - arxiv preprint arxiv …, 2025 - arxiv.org
While recent AI-for-math has made strides in pure mathematics, areas of applied
mathematics, particularly PDEs, remain underexplored despite their significant real-world …

Mathematics and Machine Creativity: A Survey on Bridging Mathematics with AI

S Liang, W Zhang, T Zhong - arxiv preprint arxiv:2412.16543, 2024 - arxiv.org
This paper presents a comprehensive survey on the applications of artificial intelligence (AI)
in mathematical research, highlighting the transformative role AI has begun to play in this …

On the logical skills of large language models: evaluations using arbitrarily complex first-order logic problems

S Ibragimov, A Jentzen, B Kuckuck - arxiv preprint arxiv:2502.14180, 2025 - arxiv.org
We present a method of generating first-order logic statements whose complexity can be
controlled along multiple dimensions. We use this method to automatically create several …

Entropy-Guided Attention for Private LLMs

NK Jha, B Reagen - arxiv preprint arxiv:2501.03489, 2025 - arxiv.org
The pervasiveness of proprietary language models has raised critical privacy concerns,
necessitating advancements in private inference (PI), where computations are performed …

AERO: Softmax-Only LLMs for Efficient Private Inference

NK Jha, B Reagen - arxiv preprint arxiv:2410.13060, 2024 - arxiv.org
The pervasiveness of proprietary language models has raised privacy concerns for users'
sensitive data, emphasizing the need for private inference (PI), where inference is performed …