MetaMath: Bootstrap your own mathematical questions for large language models
Large language models (LLMs) have pushed the limits of natural language understanding
and exhibited excellent problem-solving ability. Despite the great success, most existing …
Adapting large language models for education: Foundational capabilities, potentials, and challenges
Online education platforms, leveraging the internet to distribute education resources, seek to
provide convenient education but often fall short in real-time communication with students …
GenArtist: Multimodal LLM as an agent for unified image generation and editing
Despite the success achieved by existing image generation and editing methods, current
models still struggle with complex problems including intricate text prompts, and the …
TRIGO: Benchmarking formal mathematical proof reduction for generative language models
Automated theorem proving (ATP) has become an appealing domain for exploring the
reasoning ability of the recent successful generative language models. However, current …
DeepSeek-Prover-V1.5: Harnessing proof assistant feedback for reinforcement learning and Monte-Carlo tree search
We introduce DeepSeek-Prover-V1.5, an open-source language model designed for
theorem proving in Lean 4, which enhances DeepSeek-Prover-V1 by optimizing both …
Lean-GitHub: Compiling GitHub Lean repositories for a versatile Lean prover
Recently, large language models have presented promising results in aiding formal
mathematical reasoning. However, their performance is restricted due to the scarcity of …
Formal mathematical reasoning: A new frontier in AI
AI for Mathematics (AI4Math) is not only intriguing intellectually but also crucial for AI-driven
discovery in science, engineering, and beyond. Extensive efforts on AI4Math have mirrored …
Benchmarking large language models for math reasoning tasks
The use of Large Language Models (LLMs) in mathematical reasoning has become a
cornerstone of related research, demonstrating the intelligence of these models and …
TroVE: Inducing verifiable and efficient toolboxes for solving programmatic tasks
Language models (LMs) can solve tasks such as answering questions about tables or
images by writing programs. However, using primitive functions often leads to verbose and …
AutoVerus: Automated proof generation for Rust code
Generative AI has shown its values for many software engineering tasks. Still in its infancy,
large language model (LLM)-based proof generation lags behind LLM-based code …