Skip to main content

LangGraph agent for Lean4 theorem formalization and proving

Project description

ax-prover

A minimal agent for automated theorem proving in Lean 4

CI Python 3.11+ License: AGPL-3.0 PyPI version

A simple, modular agent that proves Lean 4 theorems through iterative refinement. It uses off-the-shelf LLMs (no fine-tuning) with a feedback loop, a memory system, and library search tools to achieve competitive results against highly-engineered systems that rely on specialized training and orders of magnitude more compute.

Key Results

Benchmark AxProverBase Best Comparable
PutnamBench 54.7% (pass@1) 13.0% (Goedel V2, pass@184)
FATE-M 98.0% 62.7% (DeepSeek V2, pass@64)
FATE-H 66.0% 3.0% (DeepSeek V2)
FATE-X 24.0% 0.0% (all others)
LeanCat 59.0% 14.0% (Gemini 3 Pro)

All results with Claude Opus 4.5, 50 iterations, pass@1. See our paper for full details and comparisons.

How It Works

ax-prover architecture

The agent runs an iterative loop:

  1. Proposer — An LLM writes Lean 4 proof code, optionally using tools (LeanSearch, web search) to find relevant Mathlib lemmas
  2. Compiler — Builds the code with lake; extracts goal states at sorry locations to provide structured feedback
  3. Reviewer — Verifies statement preservation and proof validity (no sorry, no cheating tactics)
  4. Memory — Summarizes lessons from failed attempts into a concise "lab notebook" to prevent repeating mistakes

The loop continues until the proof is complete or the iteration budget is exhausted (default: 50).

Quick Start

pip install ax-prover
# Configure your API keys
ax-prover configure

# Navigate to a Lean 4 project
cd /path/to/lean4-project

# Prove a theorem
ax-prover prove MyModule:my_theorem

Installation

pip install ax-prover
# or
uv add ax-prover

# For development (includes ruff, pytest, pre-commit)
pip install -e ".[dev]"
Prerequisites
  • Python 3.11+
  • Lean 4 with lake available on PATH (installation guide)
  • LLM API key — at least one of:
    • ANTHROPIC_API_KEY (recommended — Claude Opus 4.5 gives best results)
    • OPENAI_API_KEY
    • GOOGLE_API_KEY
  • Tavily API key (optional, for web search) — TAVILY_API_KEY

Set up your API keys interactively:

ax-prover configure

Or export them directly in your shell:

export ANTHROPIC_API_KEY=sk-ant-...

Usage

Proving theorems

# Prove a specific theorem by module path
ax-prover prove MyModule.Path:theorem_name

# Prove a specific theorem by file path
ax-prover prove MyProject/Algebra/Ring.lean:theorem_name

# Prove the theorem at a specific line
ax-prover prove MyProject/Algebra/Ring.lean#L42

# Prove all unproven theorems in a file
ax-prover prove MyProject/Algebra/Ring.lean

# Skip lake build (if repo is already built)
ax-prover prove MyModule:theorem_name --skip-build

# Save JSON output to file (for scripting/automation)
ax-prover prove MyModule:theorem_name -o result.json

Running experiments

Run batch evaluations on LangSmith datasets:

# Run experiment on a dataset
ax-prover experiment dataset_name

# With custom concurrency
ax-prover experiment dataset_name --max-concurrency 8
Configuration

Customize behavior with YAML config files and CLI overrides:

# my_config.yaml
prover:
  max_iterations: 75
  prover_llm:
    model: "anthropic:claude-opus-4-20250514"
    temperature: 0.5
    thinking:
      type: enabled
      budget_tokens: 32000
# Use a config file
ax-prover --config my_config.yaml prove MyModule:theorem

# Override values from the CLI
ax-prover prove MyModule:theorem prover.max_iterations=100

# Save your current configuration for later reuse
ax-prover --save-config my_setup prove MyModule:theorem

Contributing

We welcome contributions of all kinds — bug reports, feature requests, documentation, and code. See our Contributing Guide to get started.

License

This project is licensed under the AGPL-3.0.

Citation

If you use ax-prover in your research, please cite:

@article{axproverbase2026,
  title={A Minimal Agent for Automated Theorem Proving},
  author={Requena Pozo, Borja and Letson, Austin and Nowakowski, Krystian and Beltran Ferreiro, Izan and Sarra, Leopoldo},
  year={2026}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ax_prover-0.1.0.tar.gz (536.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ax_prover-0.1.0-py3-none-any.whl (82.0 kB view details)

Uploaded Python 3

File details

Details for the file ax_prover-0.1.0.tar.gz.

File metadata

  • Download URL: ax_prover-0.1.0.tar.gz
  • Upload date:
  • Size: 536.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for ax_prover-0.1.0.tar.gz
Algorithm Hash digest
SHA256 eca159506e032168ba4c06cb4d856ade4e6ae32ca582b6d6c1ac333290f5ae9a
MD5 ea676506dcabb9ffdca1064334c945fd
BLAKE2b-256 f19e98e9def1af4846d0b45e775e603441ced185a75136fb205a7ae1fe4e71b1

See more details on using hashes here.

File details

Details for the file ax_prover-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ax_prover-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 82.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for ax_prover-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 93c1ddf7d594101a60713192acc0d8f65414577bc0e790ca466fd5d47d24aa14
MD5 4f9c94a4b274b2cc68b1c25ec5648b5d
BLAKE2b-256 6f0a1ab27bc52239f87a9a4792068ce92369fa80c77163807ef87439eeca6ab9

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page