Skip to main content

LangGraph agent for Lean4 theorem formalization and proving

Project description

ax-prover

A minimal agent for automated theorem proving in Lean 4

CI Python 3.11+ License: AGPL-3.0 PyPI version

A simple, modular agent that proves Lean 4 theorems through iterative refinement. It uses off-the-shelf LLMs (no fine-tuning) with a feedback loop, a memory system, and library search tools to achieve competitive results against highly-engineered systems that rely on specialized training and orders of magnitude more compute.

Key Results

Benchmark AxProverBase Best Comparable
PutnamBench 54.7% (pass@1) 13.0% (Goedel V2, pass@184)
FATE-M 98.0% 62.7% (DeepSeek V2, pass@64)
FATE-H 66.0% 3.0% (DeepSeek V2)
FATE-X 24.0% 0.0% (all others)
LeanCat 59.0% 14.0% (Gemini 3 Pro)

All results with Claude Opus 4.5, 50 iterations, pass@1. See our paper for full details and comparisons.

How It Works

ax-prover architecture

The agent runs an iterative loop:

  1. Proposer — An LLM writes Lean 4 proof code, optionally using tools (LeanSearch, web search) to find relevant Mathlib lemmas
  2. Compiler — Builds the code with lake; extracts goal states at sorry locations to provide structured feedback
  3. Reviewer — Verifies statement preservation and proof validity (no sorry, no cheating tactics)
  4. Memory — Summarizes lessons from failed attempts into a concise "lab notebook" to prevent repeating mistakes

The loop continues until the proof is complete or the iteration budget is exhausted (default: 50).

Quick Start

pip install ax-prover
# Configure your API keys
ax-prover configure

# Navigate to a Lean 4 project
cd /path/to/lean4-project

# Prove a theorem
ax-prover prove MyModule:my_theorem

Installation

pip install ax-prover
# or
uv add ax-prover

# For development (includes ruff, pytest, pre-commit)
pip install -e ".[dev]"
Prerequisites
  • Python 3.11+
  • Lean 4 with lake available on PATH (installation guide)
  • LLM API key — at least one of:
    • ANTHROPIC_API_KEY (recommended — Claude Opus 4.5 gives best results)
    • OPENAI_API_KEY
    • GOOGLE_API_KEY
  • Tavily API key (optional, for web search) — TAVILY_API_KEY

Set up your API keys interactively:

ax-prover configure

Or export them directly in your shell:

export ANTHROPIC_API_KEY=sk-ant-...

Usage

Proving theorems

# Prove a specific theorem by module path
ax-prover prove MyModule.Path:theorem_name

# Prove a specific theorem by file path
ax-prover prove MyProject/Algebra/Ring.lean:theorem_name

# Prove the theorem at a specific line
ax-prover prove MyProject/Algebra/Ring.lean#L42

# Prove all unproven theorems in a file
ax-prover prove MyProject/Algebra/Ring.lean

# Skip lake build (if repo is already built)
ax-prover prove MyModule:theorem_name --skip-build

# Save JSON output to file (for scripting/automation)
ax-prover prove MyModule:theorem_name -o result.json

Running experiments

Run batch evaluations on LangSmith datasets:

# Run experiment on a dataset
ax-prover experiment dataset_name

# With custom concurrency
ax-prover experiment dataset_name --max-concurrency 8
Configuration

Customize behavior with YAML config files and CLI overrides:

# my_config.yaml
prover:
  max_iterations: 75
  prover_llm:
    model: "anthropic:claude-opus-4-20250514"
    temperature: 0.5
    thinking:
      type: enabled
      budget_tokens: 32000
# Use a config file
ax-prover --config my_config.yaml prove MyModule:theorem

# Override values from the CLI
ax-prover prove MyModule:theorem prover.max_iterations=100

# Save your current configuration for later reuse
ax-prover --save-config my_setup prove MyModule:theorem

Contributing

We welcome contributions of all kinds — bug reports, feature requests, documentation, and code. See our Contributing Guide to get started.

License

This project is licensed under the AGPL-3.0.

Citation

If you use ax-prover in your research, please cite:

@article{axproverbase2026,
  title={A Minimal Agent for Automated Theorem Proving},
  author={Requena Pozo, Borja and Letson, Austin and Nowakowski, Krystian and Beltran Ferreiro, Izan and Sarra, Leopoldo},
  year={2026}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ax_prover-0.1.1.tar.gz (535.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ax_prover-0.1.1-py3-none-any.whl (82.0 kB view details)

Uploaded Python 3

File details

Details for the file ax_prover-0.1.1.tar.gz.

File metadata

  • Download URL: ax_prover-0.1.1.tar.gz
  • Upload date:
  • Size: 535.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ax_prover-0.1.1.tar.gz
Algorithm Hash digest
SHA256 11a3694dfea2b9adbac1cbcb7ab0e3a75b027d20d7b18e5e9f29620817626996
MD5 8c555981cfb17807f12d83afd6784a2e
BLAKE2b-256 08b1fcad8e1537111dccc60bade2dc6280011d674a4f82c69c5987ca1dfa9ea9

See more details on using hashes here.

Provenance

The following attestation bundles were made for ax_prover-0.1.1.tar.gz:

Publisher: publish.yml on Axiomatic-AI/ax-prover-base

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ax_prover-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: ax_prover-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 82.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ax_prover-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 4935c0b74f89b9d1546bd2feccad9da332eac08c11ed1af6e8a56e725c46c042
MD5 f6826e6e9bb07312474f26d6e7ec0048
BLAKE2b-256 6ad48357c772a8ec1b5df75d1195609ded36a6a85a1ed05bf739d0543de42e8b

See more details on using hashes here.

Provenance

The following attestation bundles were made for ax_prover-0.1.1-py3-none-any.whl:

Publisher: publish.yml on Axiomatic-AI/ax-prover-base

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page