LangGraph agent for Lean4 theorem formalization and proving
Project description
ax-prover
A minimal agent for automated theorem proving in Lean 4
A simple, modular agent that proves Lean 4 theorems through iterative refinement. It uses off-the-shelf LLMs (no fine-tuning) with a feedback loop, a memory system, and library search tools to achieve competitive results against highly-engineered systems that rely on specialized training and orders of magnitude more compute.
Key Results
| Benchmark | AxProverBase | Best Comparable |
|---|---|---|
| PutnamBench | 54.7% (pass@1) | 13.0% (Goedel V2, pass@184) |
| FATE-M | 98.0% | 62.7% (DeepSeek V2, pass@64) |
| FATE-H | 66.0% | 3.0% (DeepSeek V2) |
| FATE-X | 24.0% | 0.0% (all others) |
| LeanCat | 59.0% | 14.0% (Gemini 3 Pro) |
All results with Claude Opus 4.5, 50 iterations, pass@1. See our paper for full details and comparisons.
How It Works
The agent runs an iterative loop:
- Proposer — An LLM writes Lean 4 proof code, optionally using tools (LeanSearch, web search) to find relevant Mathlib lemmas
- Compiler — Builds the code with
lake; extracts goal states atsorrylocations to provide structured feedback - Reviewer — Verifies statement preservation and proof validity (no
sorry, no cheating tactics) - Memory — Summarizes lessons from failed attempts into a concise "lab notebook" to prevent repeating mistakes
The loop continues until the proof is complete or the iteration budget is exhausted (default: 50).
Quick Start
pip install ax-prover
# Configure your API keys
ax-prover configure
# Navigate to a Lean 4 project
cd /path/to/lean4-project
# Prove a theorem
ax-prover prove MyModule:my_theorem
Installation
pip install ax-prover
# or
uv add ax-prover
# For development (includes ruff, pytest, pre-commit)
pip install -e ".[dev]"
Prerequisites
- Python 3.11+
- Lean 4 with
lakeavailable on PATH (installation guide) - LLM API key — at least one of:
ANTHROPIC_API_KEY(recommended — Claude Opus 4.5 gives best results)OPENAI_API_KEYGOOGLE_API_KEY
- Tavily API key (optional, for web search) —
TAVILY_API_KEY
Set up your API keys interactively:
ax-prover configure
Or export them directly in your shell:
export ANTHROPIC_API_KEY=sk-ant-...
Usage
Proving theorems
# Prove a specific theorem by module path
ax-prover prove MyModule.Path:theorem_name
# Prove a specific theorem by file path
ax-prover prove MyProject/Algebra/Ring.lean:theorem_name
# Prove the theorem at a specific line
ax-prover prove MyProject/Algebra/Ring.lean#L42
# Prove all unproven theorems in a file
ax-prover prove MyProject/Algebra/Ring.lean
# Skip lake build (if repo is already built)
ax-prover prove MyModule:theorem_name --skip-build
# Save JSON output to file (for scripting/automation)
ax-prover prove MyModule:theorem_name -o result.json
Running experiments
Run batch evaluations on LangSmith datasets:
# Run experiment on a dataset
ax-prover experiment dataset_name
# With custom concurrency
ax-prover experiment dataset_name --max-concurrency 8
Configuration
Customize behavior with YAML config files and CLI overrides:
# my_config.yaml
prover:
max_iterations: 75
prover_llm:
model: "anthropic:claude-opus-4-20250514"
temperature: 0.5
thinking:
type: enabled
budget_tokens: 32000
# Use a config file
ax-prover --config my_config.yaml prove MyModule:theorem
# Override values from the CLI
ax-prover prove MyModule:theorem prover.max_iterations=100
# Save your current configuration for later reuse
ax-prover --save-config my_setup prove MyModule:theorem
Contributing
We welcome contributions of all kinds — bug reports, feature requests, documentation, and code. See our Contributing Guide to get started.
License
This project is licensed under the AGPL-3.0.
Citation
If you use ax-prover in your research, please cite:
@article{axproverbase2026,
title={A Minimal Agent for Automated Theorem Proving},
author={Requena Pozo, Borja and Letson, Austin and Nowakowski, Krystian and Beltran Ferreiro, Izan and Sarra, Leopoldo},
year={2026}
}
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file ax_prover-0.1.0.tar.gz.
File metadata
- Download URL: ax_prover-0.1.0.tar.gz
- Upload date:
- Size: 536.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
eca159506e032168ba4c06cb4d856ade4e6ae32ca582b6d6c1ac333290f5ae9a
|
|
| MD5 |
ea676506dcabb9ffdca1064334c945fd
|
|
| BLAKE2b-256 |
f19e98e9def1af4846d0b45e775e603441ced185a75136fb205a7ae1fe4e71b1
|
File details
Details for the file ax_prover-0.1.0-py3-none-any.whl.
File metadata
- Download URL: ax_prover-0.1.0-py3-none-any.whl
- Upload date:
- Size: 82.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
93c1ddf7d594101a60713192acc0d8f65414577bc0e790ca466fd5d47d24aa14
|
|
| MD5 |
4f9c94a4b274b2cc68b1c25ec5648b5d
|
|
| BLAKE2b-256 |
6f0a1ab27bc52239f87a9a4792068ce92369fa80c77163807ef87439eeca6ab9
|