A recursive, reflective POETRY algorithm variant using Goedel-Prover-V2

Gödel's Poetry

Gödel's Poetry is an automated theorem proving system that combines Large Language Models (LLMs) with formal verification in Lean 4. It takes mathematical theorems, stated either in informal natural language or in formal Lean syntax, and generates verified proofs through a multi-agent architecture.



What Does Gödel's Poetry Do?

Gödel's Poetry is an AI-powered theorem proving system that bridges the gap between informal mathematical reasoning and formal verification. The system:

  1. Accepts theorems in multiple formats:

    • Informal natural language (e.g., "Prove that the square root of 2 is irrational")
    • Formal Lean 4 syntax (e.g., theorem sqrt_two_irrational : Irrational (√2) := by sorry)
  2. Automatically generates verified proofs through a multi-agent workflow:

    • Formalization: Converts informal statements into formal Lean 4 theorems
    • Semantic Checking: Validates that formalizations preserve the original meaning
    • Proof Generation: Creates proofs using specialized LLMs trained on Lean 4
    • Proof Sketching: Decomposes difficult theorems into manageable subgoals
    • Verification: Validates all proofs using the Lean 4 proof assistant
    • Recursive Refinement: Iteratively improves proofs until they are complete and verified
  3. Leverages state-of-the-art technology:

    • Custom fine-tuned models (Goedel-Prover-V2, Goedel-Formalizer-V2)
    • Integration with frontier LLMs (GPT-5, Qwen3)
    • The Kimina Lean Server for high-performance Lean 4 verification
    • LangGraph for orchestrating complex multi-agent workflows

The system is designed for researchers, mathematicians, and AI practitioners interested in automated theorem proving, formal verification, and the intersection of natural and formal languages.


Quick Start

Prerequisites

Before installing Gödel's Poetry, ensure you have:

  • Python 3.9 or higher (tested on Python 3.9-3.13)
  • pip (comes with Python)
  • Lean 4 for the Kimina server (installation covered below)

For development:

  • uv - Fast Python package installer (optional, but recommended for development)
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  • Git for cloning the repository

Installation

Option 1: Install from PyPI (Recommended)

# Install using pip
pip install goedels-poetry

# Verify installation
goedels_poetry --help

Option 2: Install from Source (For Development)

# Clone the repository
git clone https://github.com/KellyJDavis/goedels-poetry.git
cd goedels-poetry

# Install with uv (recommended) or pip
uv sync

# The command line tool is now available via:
uv run goedels_poetry --help

Running the Kimina Lean Server

The Kimina Lean Server is required for Gödel's Poetry to verify Lean 4 proofs. It provides high-performance parallel proof checking.

Setup Steps:

  1. Clone the Kimina Lean Server (separate repository):

    git clone https://github.com/KellyJDavis/kimina-lean-server.git
    cd kimina-lean-server
    
  2. Run the setup script (installs Lean 4, mathlib4, and dependencies):

    bash setup.sh
    

    This will:

    • Install Elan (the Lean version manager)
    • Install Lean 4 (default version v4.15.0)
    • Clone and build the Lean REPL
    • Clone and build the AST export tool
    • Clone and build mathlib4 (Lean's math library)

    ⚠️ Note: This process can take 15-30 minutes depending on your system.

  3. Install server dependencies:

    pip install -r requirements.txt
    pip install .
    prisma generate
    
  4. Start the server:

    python -m server
    

    The server will start on http://0.0.0.0:8000 by default.

  5. Verify the server is running (in a new terminal):

    curl --request POST \
      --url http://localhost:8000/verify \
      --header 'Content-Type: application/json' \
      --data '{
        "codes": [{"custom_id": "test", "proof": "#check Nat"}],
        "infotree_type": "original"
      }' | jq
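
The same check can be scripted. The sketch below is a minimal Python equivalent of the curl call above, using only the standard library. The helper names (build_payload, verify) are illustrative and not part of either project, and the reply is returned as parsed JSON without assuming anything about its schema.

```python
import json
import urllib.request

# Assumed default; matches the URL used in the curl example.
KIMINA_URL = "http://localhost:8000/verify"


def build_payload(code: str, custom_id: str = "test") -> dict:
    """Build the same request body that the curl example sends."""
    return {
        "codes": [{"custom_id": custom_id, "proof": code}],
        "infotree_type": "original",
    }


def verify(code: str, url: str = KIMINA_URL, timeout: float = 60.0):
    """POST a Lean snippet to the server and return the decoded JSON reply."""
    body = json.dumps(build_payload(code)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running, calling verify("#check Nat") and printing the result should show the server's reply for the test snippet.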
    

Alternative: Docker (Production)

For production deployments, you can use Docker:

cd kimina-lean-server
docker compose up

See the Kimina Server README for more deployment options.

Setting Up Your API Keys

Gödel's Poetry supports both OpenAI and Google Generative AI for certain reasoning tasks. You can use either provider:

Option 1: OpenAI (Default)

  1. Get an API key from OpenAI's platform

  2. Set the environment variable:

    On Linux/macOS:

    export OPENAI_API_KEY='your-api-key-here'
    

    On Windows (Command Prompt):

    set OPENAI_API_KEY=your-api-key-here
    

    On Windows (PowerShell):

    $env:OPENAI_API_KEY='your-api-key-here'
    

Option 2: Google Generative AI

  1. Get an API key from Google AI Studio

  2. Set the environment variable:

    On Linux/macOS:

    export GOOGLE_API_KEY='your-api-key-here'
    

    On Windows (Command Prompt):

    set GOOGLE_API_KEY=your-api-key-here
    

    On Windows (PowerShell):

    $env:GOOGLE_API_KEY='your-api-key-here'
    

Provider Selection

The system automatically selects the provider based on available API keys:

  • If both keys are set, OpenAI takes priority
  • If only one key is set, that provider is used
  • If no keys are set, the system falls back to OpenAI with a warning
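
The three rules above can be written as a small function. This is only a sketch of the documented behavior, not the package's own code: the name select_provider is hypothetical, and it ignores the provider setting in config.ini, which may force a specific provider.

```python
import os
import warnings


def select_provider(env=None) -> str:
    """Return "openai" or "google" following the rules listed above."""
    env = os.environ if env is None else env
    has_openai = bool(env.get("OPENAI_API_KEY"))
    has_google = bool(env.get("GOOGLE_API_KEY"))

    if has_openai:
        # Covers "only OpenAI" and "both keys set" (OpenAI takes priority).
        return "openai"
    if has_google:
        return "google"
    # No keys at all: fall back to OpenAI, but warn the user.
    warnings.warn("No API key found; falling back to OpenAI.")
    return "openai"
```

Passing a plain dict instead of os.environ makes the rules easy to check without touching the real environment.
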
Making a Key Permanent (Optional)

To keep a key across shell sessions, add the matching export command to your shell configuration file:

  • Bash: ~/.bashrc or ~/.bash_profile
  • Zsh: ~/.zshrc
  • Fish: ~/.config/fish/config.fish

Using the Command Line Tool

Once installed, you can use the goedels_poetry command to prove theorems:

Prove a Single Formal Theorem

goedels_poetry --formal-theorem "import Mathlib\n\nopen BigOperators\n\ntheorem theorem_54_43 : 1 + 1 = 2 := by sorry"

Formal theorems supplied on the command line (or via files) must include their full Lean preamble—imports, options, namespaces, and any comments required to state the theorem. Gödel's Poetry no longer prepends the default header for user-supplied formal problems. (The default header is still added automatically when an informal theorem is formalized by the system.)
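
When driving the tool from a script, it can be easier to assemble the preamble and theorem in Python and pass them as one argument. The sketch below assumes the CLI accepts a theorem string containing real newlines, as the PowerShell example in this README suggests; build_command and prove are hypothetical helper names.

```python
import subprocess

# Full Lean preamble plus the theorem statement, joined with real newlines.
FORMAL_THEOREM = "\n".join(
    [
        "import Mathlib",
        "",
        "open BigOperators",
        "",
        "theorem theorem_54_43 : 1 + 1 = 2 := by sorry",
    ]
)


def build_command(theorem: str) -> list[str]:
    """Return the argv list for proving a single formal theorem."""
    return ["goedels_poetry", "--formal-theorem", theorem]


def prove(theorem: str) -> int:
    """Run the CLI (it must be on PATH) and return its exit code."""
    return subprocess.run(build_command(theorem), check=False).returncode
```

Passing the arguments as a list avoids shell quoting entirely, so the theorem text reaches the tool unchanged.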

Prove a Single Informal Theorem

goedels_poetry --informal-theorem "Prove that the sum of two even numbers is even"

Batch Process Multiple Theorems

Process all .lean files in a directory:

goedels_poetry --formal-theorems ./my-theorems/

Process all .txt files containing informal theorems:

goedels_poetry --informal-theorems ./informal-theorems/

For batch processing, the tool will:

  • Read each theorem from its file
  • Attempt to generate and verify a proof
  • Save results to .proof files alongside the originals

Get Help

goedels_poetry --help

Enable Debug Mode

To see detailed LLM and Kimina server responses during execution, set the GOEDELS_POETRY_DEBUG environment variable:

On Linux/macOS:

export GOEDELS_POETRY_DEBUG=1
goedels_poetry --formal-theorem "import Mathlib\n\nopen BigOperators\n\ntheorem theorem_54_43 : 1 + 1 = 2 := by sorry"

On Windows (Command Prompt):

set GOEDELS_POETRY_DEBUG=1
goedels_poetry --formal-theorem "import Mathlib\n\nopen BigOperators\n\ntheorem theorem_54_43 : 1 + 1 = 2 := by sorry"

On Windows (PowerShell):

$env:GOEDELS_POETRY_DEBUG=1
goedels_poetry --formal-theorem "import Mathlib`n`nopen BigOperators`n`ntheorem theorem_54_43 : 1 + 1 = 2 := by sorry"

When debug mode is enabled, all responses from:

  • FORMALIZER_AGENT_LLM - Formalization responses
  • PROVER_AGENT_LLM - Proof generation responses
  • SEMANTICS_AGENT_LLM - Semantic checking responses
  • DECOMPOSER_AGENT_LLM - Proof sketching/decomposition responses
  • KIMINA_SERVER - Lean 4 verification and AST parsing responses

will be printed to the console with rich formatting for easy debugging and inspection.


Examples

Example 1: Simple Arithmetic

goedels_poetry --formal-theorem \
  "import Mathlib\n\nopen BigOperators\n\ntheorem add_comm_example : 3 + 5 = 5 + 3 := by sorry"

Example 2: Informal Theorem

goedels_poetry --informal-theorem \
  "Prove that for any natural numbers a and b, a + b = b + a"

Example 3: Batch Processing

Create a directory with theorem files:

mkdir theorems
cat <<'EOF' > theorems/test1.lean
import Mathlib

open BigOperators

theorem test1 : 2 + 2 = 4 := by sorry
EOF

cat <<'EOF' > theorems/test2.lean
import Mathlib

open BigOperators

theorem test2 : 5 * 5 = 25 := by sorry
EOF

goedels_poetry --formal-theorems ./theorems/

Results will be saved as test1.proof and test2.proof.
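
After a batch run, a short script can report which theorems still lack a result file. This sketch assumes, based on the example above, that each result is written next to its source with the .lean suffix replaced by .proof. It does not inspect the file contents, since the .proof format is not described here, and missing_proofs is a hypothetical helper name.

```python
from pathlib import Path


def missing_proofs(directory: str) -> list[str]:
    """List .lean files in `directory` that have no sibling .proof file."""
    missing = []
    for lean_file in sorted(Path(directory).glob("*.lean")):
        if not lean_file.with_suffix(".proof").exists():
            missing.append(lean_file.name)
    return missing
```

For the directory created above, missing_proofs("./theorems/") would return an empty list once both result files exist.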


How It Works

Gödel's Poetry uses a multi-agent architecture coordinated by a supervisor agent. The workflow adapts to the input:

For Informal Theorems:

  1. Formalizer Agent - Converts natural language to Lean 4 syntax
  2. Syntax Checker Agent - Validates the formal theorem syntax
  3. Semantics Agent - Ensures the formalization preserves meaning
  4. Prover Agent - Generates the proof
  5. Proof Checker Agent - Verifies the proof in Lean 4
  6. Parser Agent - Extracts the AST structure

For Complex Theorems (Recursive Decomposition):

When direct proving fails, the system activates proof sketching:

  1. Proof Sketcher Agent - Creates a high-level proof outline
  2. Sketch Checker Agent - Validates the sketch syntax
  3. Decomposition Agent - Extracts sub-theorems from the sketch
  4. Recursive Proving - Each sub-theorem is proved independently
  5. Proof Reconstruction - Combines verified sub-proofs into the final proof
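
The control flow of these steps can be illustrated with a toy recursion. The sketch below is not the project's implementation: try_prove and decompose are hypothetical stand-ins for the prover and sketching agents, proofs are plain strings, reconstruction is simple concatenation, and syntax checking and backtracking are left out. The default limits mirror max_pass and max_depth from the default configuration.

```python
from typing import Callable, List, Optional

# Hypothetical stand-ins for the real agents:
#   try_prove(goal) -> a proof string, or None if the attempt fails
#   decompose(goal) -> sub-goals extracted from a proof sketch
Prover = Callable[[str], Optional[str]]
Decomposer = Callable[[str], List[str]]


def prove(goal: str, try_prove: Prover, decompose: Decomposer,
          max_pass: int = 32, max_depth: int = 20,
          depth: int = 0) -> Optional[str]:
    """Try direct proofs first; on failure, split the goal and recurse."""
    for _ in range(max_pass):
        proof = try_prove(goal)
        if proof is not None:
            return proof
    if depth >= max_depth:
        return None  # recursion budget exhausted on this branch
    sub_proofs = []
    for sub_goal in decompose(goal):
        sub_proof = prove(sub_goal, try_prove, decompose,
                          max_pass, max_depth, depth + 1)
        if sub_proof is None:
            return None  # one unproved sub-goal sinks this decomposition
        sub_proofs.append(sub_proof)
    if not sub_proofs:
        return None  # nothing to reconstruct from
    # Toy "reconstruction": the real system rebuilds a single Lean proof.
    return "\n".join(sub_proofs)


# Toy agents: any goal containing "&" cannot be proved directly.
def toy_prover(goal: str) -> Optional[str]:
    return None if "&" in goal else f"proof({goal})"


def toy_decomposer(goal: str) -> List[str]:
    return [part.strip() for part in goal.split("&")]


result = prove("A & B", toy_prover, toy_decomposer, max_pass=2)
```

Here the direct attempts on "A & B" fail, the goal is split into "A" and "B", and each part is proved on its own before the pieces are joined.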

Key Features:

  • Automatic Correction: Agents iteratively fix syntax and logical errors
  • Backtracking: When a decomposition approach fails, the system tries alternatives
  • State Management: Complete provenance tracking for reproducibility
  • Parallel Processing: Batch theorem proving with efficient resource usage

Developer Guide

Development Setup

  1. Clone and install with development dependencies:

    git clone https://github.com/KellyJDavis/goedels-poetry.git
    cd goedels-poetry
    make install
    

    This will:

    • Create a virtual environment using uv
    • Install all dependencies
    • Set up pre-commit hooks for code quality
  2. Activate the environment (if needed):

    source .venv/bin/activate  # Linux/macOS
    .venv\Scripts\activate     # Windows
    

Testing

The project includes comprehensive unit and integration tests.

Unit Tests Only (Fast)

make test

This runs all tests except those that require a Lean installation.

Integration Tests (Requires Lean Server)

Integration tests verify the Kimina Lean Server integration. These tests require a running Kimina Lean server.

First-time setup:

# Install integration test dependencies
uv sync

# Clone the Kimina Lean Server (if not already cloned)
cd .. && git clone https://github.com/KellyJDavis/kimina-lean-server.git
cd kimina-lean-server

# Install Lean and build dependencies (takes 15-30 minutes)
bash setup.sh

# Install server dependencies
pip install -r requirements.txt
pip install .
prisma generate

Run integration tests:

# Terminal 1: Start the Kimina server
cd ../kimina-lean-server
python -m server

# Terminal 2: Run the tests
cd ../goedels-poetry
make test-integration

The tests will automatically connect to http://localhost:8000. To use a different URL:

export KIMINA_SERVER_URL=http://localhost:9000
make test-integration

Note: Integration tests require Python 3.10+ and a running Lean server with proper REPL configuration.

All Tests

make test-all

This runs both unit and integration tests sequentially.

Makefile Targets

The repository provides several convenient Make targets:

Target                   Description
make install             Install the virtual environment and pre-commit hooks
make check               Run all code quality checks (linting, type checking, dependency audit)
make test                Run unit tests with coverage (excludes integration tests)
make test-integration    Run integration tests (requires Lean installation)
make test-all            Run all tests (unit + integration)
make build               Build wheel distribution package
make clean-build         Remove build artifacts
make publish             Publish to PyPI (requires credentials)
make docs                Build and serve documentation locally
make docs-test           Test documentation build without serving
make help                Display all available targets with descriptions

Code Quality Tools

The make check target runs:

  • uv lock - Ensures lock file consistency
  • pre-commit - Runs linting and formatting (Ruff)
  • mypy - Static type checking
  • deptry - Checks for unused, missing, and transitive dependencies

Configuration

Default Configuration Parameters

Configuration is stored in goedels_poetry/data/config.ini:

[FORMALIZER_AGENT_LLM]
model = kdavis/goedel-formalizer-v2:32b
num_ctx = 40960
max_retries = 10

[PROVER_AGENT_LLM]
model = kdavis/Goedel-Prover-V2:32b
num_ctx = 40960
max_self_correction_attempts = 2
max_depth = 20
max_pass = 32

[SEMANTICS_AGENT_LLM]
model = qwen3:30b
num_ctx = 262144

[DECOMPOSER_AGENT_LLM]
# Provider selection (openai, google, auto)
provider = auto

# OpenAI-specific settings
openai_model = gpt-5-2025-08-07
openai_max_completion_tokens = 50000
openai_max_remote_retries = 5
openai_max_self_correction_attempts = 6

# Google-specific settings
google_model = gemini-2.5-pro
google_max_output_tokens = 50000
google_max_self_correction_attempts = 6

[KIMINA_LEAN_SERVER]
url = http://0.0.0.0:8000
max_retries = 5

Configuration Parameters Explained

Formalizer Agent:

  • model: The LLM used to convert informal theorems to Lean 4
  • num_ctx: Context window size (tokens)
  • max_retries: Maximum attempts to formalize a theorem

Prover Agent:

  • model: The LLM used to generate proofs
  • num_ctx: Context window size (tokens)
  • max_self_correction_attempts: Maximum proof generation self-correction attempts
  • max_depth: Maximum recursion depth for proof decomposition
  • max_pass: Maximum number of proof attempts before triggering decomposition

Semantics Agent:

  • model: The LLM used to validate semantic equivalence
  • num_ctx: Context window size (tokens)

Decomposer Agent:

  • provider: Provider selection (openai, google, or auto)
  • openai_model: The OpenAI model used for proof sketching (when OpenAI is selected)
  • openai_max_completion_tokens: Maximum tokens in OpenAI-generated response
  • openai_max_remote_retries: Retry attempts for OpenAI API calls
  • openai_max_self_correction_attempts: Maximum decomposition self-correction attempts for OpenAI
  • google_model: The Google model used for proof sketching (when Google is selected)
  • google_max_output_tokens: Maximum tokens in Google-generated response
  • google_max_self_correction_attempts: Maximum decomposition self-correction attempts for Google

Kimina Lean Server:

  • url: Server endpoint for Lean verification
  • max_retries: Maximum retry attempts for server requests

Overriding Configuration with Environment Variables

The recommended way to customize configuration is through environment variables. This approach requires no file changes and works well across different environments (development, testing, production):

Format: SECTION__OPTION (double underscore separator, uppercase)

Examples:

# Use a different prover model
export PROVER_AGENT_LLM__MODEL="custom-model:latest"

# Change the Kimina server URL
export KIMINA_LEAN_SERVER__URL="http://localhost:9000"

# Use a smaller context window for faster testing
export PROVER_AGENT_LLM__NUM_CTX="8192"

# Run with custom configuration
goedels_poetry --formal-theorem "import Mathlib\n\nopen BigOperators\n\ntheorem theorem_54_43 : 1 + 1 = 2 := by sorry"

Multiple overrides:

export PROVER_AGENT_LLM__MODEL="kdavis/Goedel-Prover-V2:70b"
export PROVER_AGENT_LLM__MAX_SELF_CORRECTION_ATTEMPTS="3"
export PROVER_AGENT_LLM__MAX_PASS="64"
export DECOMPOSER_AGENT_LLM__OPENAI_MODEL="gpt-5-pro"
export KIMINA_LEAN_SERVER__MAX_RETRIES="10"
# Provide the full preamble plus theorem body when invoking formal problems
goedels_poetry --formal-theorem "import Mathlib\n\nopen BigOperators\n\ntheorem theorem_54_43 : 1 + 1 = 2 := by sorry"

Using Google Generative AI:

export GOOGLE_API_KEY="your-google-api-key"
export DECOMPOSER_AGENT_LLM__GOOGLE_MODEL="gemini-2.5-pro"
export DECOMPOSER_AGENT_LLM__GOOGLE_MAX_OUTPUT_TOKENS="100000"
goedels_poetry --formal-theorem "..."

Environment variables are optional; if one is not set, the system uses the value from config.ini.
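
The lookup order can be sketched in a few lines of Python. This is an illustration of the SECTION__OPTION convention described above, not the package's configuration code: get_setting is a hypothetical helper, and DEFAULTS is a trimmed copy of two sections from the default config.ini shown earlier.

```python
import configparser
import os

# Trimmed copy of two sections from the default config.ini.
DEFAULTS = """
[PROVER_AGENT_LLM]
model = kdavis/Goedel-Prover-V2:32b
num_ctx = 40960

[KIMINA_LEAN_SERVER]
url = http://0.0.0.0:8000
max_retries = 5
"""


def get_setting(section: str, option: str, env=None) -> str:
    """Return the SECTION__OPTION override if set, else the ini value."""
    env = os.environ if env is None else env
    parser = configparser.ConfigParser()
    parser.read_string(DEFAULTS)
    # Uppercase name with a double underscore between section and option.
    env_name = f"{section}__{option}".upper()
    if env_name in env:
        return env[env_name]
    return parser.get(section, option)
```

For example, get_setting("PROVER_AGENT_LLM", "num_ctx") yields the ini value unless PROVER_AGENT_LLM__NUM_CTX is exported, in which case the exported value is returned.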

For more details and advanced configuration options, see CONFIGURATION.md.

Alternative: Modifying config.ini Directly

If you prefer, you can still modify the configuration file directly:

# Find the installation path
uv run python -c "import goedels_poetry; print(goedels_poetry.__file__)"

# Edit the config.ini in the installation directory
# Typically: .venv/lib/python3.x/site-packages/goedels_poetry/data/config.ini

Note: Direct file changes persist until you reinstall or update the package, while environment variables are more flexible and don't require reinstallation.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for detailed guidelines.

Quick contribution workflow:

  1. Fork the repository
  2. Clone your fork: git clone git@github.com:YOUR_NAME/goedels-poetry.git
  3. Install development environment: make install
  4. Create a feature branch: git checkout -b feature-name
  5. Make your changes and add tests
  6. Run quality checks: make check
  7. Run tests: make test
  8. Commit with descriptive messages
  9. Push and create a pull request

Code quality requirements:

  • All tests must pass (make test)
  • Code must pass linting and type checking (make check)
  • New features should include tests and documentation
  • Follow the existing code style and conventions

Project Structure

goedels-poetry/
├── goedels_poetry/           # Main package
│   ├── agents/               # Multi-agent system components
│   │   ├── formalizer_agent.py
│   │   ├── prover_agent.py
│   │   ├── proof_checker_agent.py
│   │   ├── sketch_*.py       # Proof sketching agents
│   │   └── ...
│   ├── config/               # Configuration management
│   ├── data/                 # Prompts and config files
│   │   ├── config.ini
│   │   └── prompts/
│   ├── parsers/              # AST parsing utilities
│   ├── cli.py                # Command-line interface
│   ├── framework.py          # Core orchestration logic
│   └── state.py              # State management
├── tests/                    # Test suite
├── Makefile                  # Development automation
├── pyproject.toml            # Package configuration
├── CHANGELOG.md              # Version history
└── README.md                 # This file

Note: The Kimina Lean Server is a separate repository that must be installed and run independently.


License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.


Acknowledgments

  • Kimina Lean Server: Built on Project Numina's excellent Lean verification server
  • Lean 4: The formal verification system that powers proof checking
  • LangChain & LangGraph: Frameworks for LLM orchestration
  • Mathlib4: Comprehensive mathematics library for Lean

Citation

If you use Gödel's Poetry in your research, please cite:

@software{goedels_poetry,
  author = {Davis, Kelly J.},
  title = {Gödel's Poetry: Recursive Automated Theorem Proving},
  year = {2025},
  url = {https://github.com/KellyJDavis/goedels-poetry}
}

For the Kimina Lean Server:

@misc{santos2025kiminaleanservertechnical,
  title={Kimina Lean Server: Technical Report},
  author={Marco Dos Santos and Haiming Wang and Hugues de Saxcé and Ran Wang and Mantas Baksys and Mert Unsal and Junqi Liu and Zhengying Liu and Jia Li},
  year={2025},
  eprint={2504.21230},
  archivePrefix={arXiv},
  primaryClass={cs.LO},
  url={https://arxiv.org/abs/2504.21230}
}


Ready to prove some theorems? 🚀

goedels_poetry --informal-theorem "Prove that the sum of the first n natural numbers equals n(n+1)/2"
