A survey of deep learning for mathematical reasoning

P Lu, L Qiu, W Yu, S Welleck, KW Chang - arXiv preprint arXiv:2212.10535, 2022 - arxiv.org
Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in
various fields, including science, engineering, finance, and everyday life. The development …

HOL Light: A tutorial introduction

J Harrison - International Conference on Formal Methods in …, 1996 - Springer
HOL Light is a new version of the HOL theorem prover. While retaining the reliability and
programmability of earlier versions, it is more elegant, lightweight, powerful and automatic; it …

Solving olympiad geometry without human demonstrations

TH Trinh, Y Wu, QV Le, H He, T Luong - Nature, 2024 - nature.com
Proving mathematical theorems at the olympiad level represents a notable milestone in
human-level automated reasoning, owing to their reputed difficulty among the world's best …

Logic-LM: Empowering large language models with symbolic solvers for faithful logical reasoning

L Pan, A Albalak, X Wang, WY Wang - arXiv preprint arXiv:2305.12295, 2023 - arxiv.org
Large Language Models (LLMs) have shown human-like reasoning abilities but still struggle
with complex logical problems. This paper introduces a novel framework, Logic-LM, which …

Draft, sketch, and prove: Guiding formal theorem provers with informal proofs

AQ Jiang, S Welleck, JP Zhou, W Li, J Liu… - arXiv preprint arXiv …, 2022 - arxiv.org
The formalization of existing mathematical proofs is a notoriously difficult process. Despite
decades of research on automation and proof assistants, writing formal proofs remains …

A survey of reasoning with foundation models

J Sun, C Zheng, E Xie, Z Liu, R Chu, J Qiu, J Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-
world settings such as negotiation, medical diagnosis, and criminal investigation. It serves …

PutnamBench: Evaluating neural theorem-provers on the Putnam mathematical competition

G Tsoukalas, J Lee, J Jennings, J Xin… - Advances in …, 2025 - proceedings.neurips.cc
We present PutnamBench, a new multi-language benchmark for evaluating the ability of
neural theorem-provers to solve competition mathematics problems. PutnamBench consists …

LEGO-Prover: Neural theorem proving with growing libraries

H Wang, H Xin, C Zheng, L Li, Z Liu, Q Cao… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite the success of large language models (LLMs), the task of theorem proving still
remains one of the hardest reasoning tasks that is far from being fully solved. Prior methods …

An orchestrated survey of methodologies for automated software test case generation

S Anand, EK Burke, TY Chen, J Clark… - Journal of systems and …, 2013 - Elsevier
Test case generation is among the most labour-intensive tasks in software testing. It also has
a strong impact on the effectiveness and efficiency of software testing. For these reasons, it …

A survey of neural code intelligence: Paradigms, advances and beyond

Q Sun, Z Chen, F Xu, K Cheng, C Ma, Z Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
Neural Code Intelligence--leveraging deep learning to understand, generate, and optimize
code--holds immense potential for transformative impacts on the whole society. Bridging the …