Modal-backed MCP server for parallel evaluation of agent-generated code variants

Evolution MCP

Modal-backed MCP server for evaluating multiple agent-generated code variants in parallel.

Codex runs locally and generates candidate patches or full-file writes. Modal only evaluates those variants:

  1. Sync the dirty local repo into a Modal image with Image.add_local_dir.
  2. Start a setup sandbox and run setup commands once.
  3. Snapshot the prepared filesystem with Sandbox.snapshot_filesystem.
  4. Start run_experiment; it returns immediately with status: running.
  5. Enqueue a durable Redis Stream job.
  6. A worker process claims the job and forks one Modal sandbox per variant.
  7. Apply each patch or full-file write inside its sandbox.
  8. Run the same eval command in each sandbox.
  9. Persist result metadata and artifacts to Redis.
  10. Poll get_experiment until the experiment reaches a terminal status (completed, failed, or no_passing_variant).
  11. Optionally call apply_winner to apply the winning diff locally.
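
The snapshot-then-fork idea in the steps above can be illustrated without Modal. The following is a minimal local sketch, not the project's implementation: a dict of path to content stands in for the prepared filesystem snapshot, a thread pool stands in for the per-variant sandboxes, and `evaluate_variants` and `eval_fn` are hypothetical names. The rule that the first passing variant wins is also an assumption, since the README does not describe how the winner is chosen.

```python
from concurrent.futures import ThreadPoolExecutor
from copy import deepcopy


def evaluate_variants(snapshot, variants, eval_fn, parallelism=2):
    """Fork the prepared snapshot once per variant, apply it, and evaluate.

    `snapshot` is a dict of path -> content standing in for the prepared
    filesystem; `eval_fn` stands in for the shared eval command and returns
    True on success. Both are hypothetical stand-ins, not evolution-mcp APIs.
    """
    def run_one(variant):
        files = deepcopy(snapshot)       # each variant gets its own fork
        files.update(variant["files"])   # apply the full-file writes
        return {"name": variant["name"], "passed": bool(eval_fn(files))}

    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        results = list(pool.map(run_one, variants))  # order follows `variants`

    # Assumption: the first passing variant wins.
    winner = next((r["name"] for r in results if r["passed"]), None)
    status = "completed" if winner else "no_passing_variant"
    return {"status": status, "winner": winner, "results": results}


snapshot = {"app.py": "def add(a, b):\n    return a - b\n"}
variants = [
    {"name": "noop", "files": {}},
    {"name": "minimal-fix", "files": {"app.py": "def add(a, b):\n    return a + b\n"}},
]


def eval_fn(files):
    scope = {}
    exec(files["app.py"], scope)         # stand-in for running `pytest -q`
    return scope["add"](2, 3) == 5


outcome = evaluate_variants(snapshot, variants, eval_fn)
print(outcome["status"], outcome["winner"])  # completed minimal-fix
```

Because every variant starts from its own copy of the same snapshot, one variant's writes cannot leak into another's evaluation, which is the property the per-variant sandboxes provide.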

Install

From a source checkout:

uv sync

For the published package, the intended invocation is:

uvx --from evolution-mcp evolution-mcp-doctor

Configure Modal

uv run modal setup

The controller uses your local Modal credentials to create images, sandboxes, and snapshots. Eval sandboxes do not need OpenAI credentials. Only pass secret_names when the repo's own setup or tests need secrets.

Configure Redis

V2 uses Redis for durable workspace records, experiment records, statuses, and small artifacts.

export REDIS_URL=redis://localhost:6379/0

Redis must be running before normal use. On macOS with Homebrew:

brew install redis
brew services start redis

For dev-only local JSON persistence:

export EVOLUTION_MCP_STORAGE=local
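
As a rough illustration of how the two environment variables could interact, here is a small sketch. The function `resolve_storage` is hypothetical, and the precedence it applies (EVOLUTION_MCP_STORAGE=local overrides REDIS_URL, and a missing REDIS_URL is an error) is an assumption rather than documented behavior.

```python
import os


def resolve_storage(env=None):
    """Pick a storage backend from environment variables (assumed precedence)."""
    env = os.environ if env is None else env
    # Assumption: the dev-only local backend wins when explicitly requested.
    if env.get("EVOLUTION_MCP_STORAGE") == "local":
        return {"backend": "local"}
    url = env.get("REDIS_URL")
    if not url:
        raise RuntimeError(
            "Set REDIS_URL, or EVOLUTION_MCP_STORAGE=local for dev-only use"
        )
    return {"backend": "redis", "url": url}


print(resolve_storage({"REDIS_URL": "redis://localhost:6379/0"}))
print(resolve_storage({"EVOLUTION_MCP_STORAGE": "local"}))
```

Running evolution-mcp-doctor remains the authoritative way to check which storage the installed package actually uses.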

Doctor

uv run evolution-mcp-doctor

This checks storage, checks the active Modal profile, and prints a Codex MCP config snippet.

Run The MCP Server

uvx --from evolution-mcp evolution-mcp

Run A Worker

Run at least one worker process alongside the MCP server:

uvx --from evolution-mcp evolution-mcp-worker

Workers consume Redis Stream jobs. A worker acknowledges a job only after it has written the completed or failed experiment record. If a worker dies first, the job stays unacknowledged, and another worker can reclaim it once the interval set by --reclaim-after-ms has passed.
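
The acknowledge-after-write rule can be shown with an in-memory stand-in for a stream consumer group. This is a sketch of the general pattern only: `FakeStream` and `work` are hypothetical, no Redis connection is made, and the comments name the real Redis commands (XACK, XAUTOCLAIM) that play the matching roles without claiming the worker uses exactly those calls.

```python
class FakeStream:
    """In-memory stand-in for a Redis Stream consumer group (illustrative only)."""

    def __init__(self):
        self.queue = []     # job ids not yet delivered to any consumer
        self.pending = {}   # job id -> (consumer, delivered_at_ms)

    def add(self, job_id):
        self.queue.append(job_id)

    def claim_next(self, consumer, now_ms, reclaim_after_ms):
        # Prefer jobs whose previous owner has been silent for too long,
        # which is the role XAUTOCLAIM plays on a real stream.
        for job_id, (_, delivered_at) in self.pending.items():
            if now_ms - delivered_at >= reclaim_after_ms:
                self.pending[job_id] = (consumer, now_ms)
                return job_id
        if self.queue:
            job_id = self.queue.pop(0)
            self.pending[job_id] = (consumer, now_ms)
            return job_id
        return None

    def ack(self, job_id):
        self.pending.pop(job_id, None)   # the role XACK plays


def work(stream, records, consumer, now_ms, reclaim_after_ms, crash=False):
    job_id = stream.claim_next(consumer, now_ms, reclaim_after_ms)
    if job_id is None:
        return None
    if crash:
        return job_id                    # died before writing: job stays pending
    records[job_id] = "completed"        # write the record first...
    stream.ack(job_id)                   # ...then acknowledge
    return job_id


stream, records = FakeStream(), {}
stream.add("exp_1")
work(stream, records, "worker-a", now_ms=0, reclaim_after_ms=60_000, crash=True)
early = work(stream, records, "worker-b", now_ms=1_000, reclaim_after_ms=60_000)
late = work(stream, records, "worker-b", now_ms=61_000, reclaim_after_ms=60_000)
print(early, late, records)  # None exp_1 {'exp_1': 'completed'}
```

The ordering gives at-least-once processing: a job can run twice after a crash, but it is never acknowledged without a stored record.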

Add To Codex

Example MCP config:

{
  "mcpServers": {
    "evolution-mcp": {
      "command": "uvx",
      "args": ["--from", "evolution-mcp", "evolution-mcp"],
      "env": {
        "REDIS_URL": "redis://localhost:6379/0"
      }
    }
  }
}

Tool Flow

Prepare once with prepare_workspace:

{
  "repo_path": "/Users/me/project",
  "setup_commands": ["pytest --version"]
}

Start an asynchronous experiment with run_experiment:

{
  "workspace_id": "ws_...",
  "variants": [
    {
      "name": "minimal-fix",
      "patch": "diff --git a/app.py b/app.py\n..."
    },
    {
      "name": "boundary-fix",
      "files": [
        {"path": "app.py", "content": "...full file content..."}
      ]
    }
  ],
  "eval_command": "pytest -q",
  "parallelism": 2
}
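
An agent or script that assembles this payload may want to sanity-check it first. The helper below is a hypothetical sketch, not part of evolution-mcp. Its rules (unique variant names, exactly one of patch or files per variant, parallelism of at least 1) are assumptions inferred from the example above, and the server may accept or reject different shapes.

```python
def build_experiment_request(workspace_id, variants, eval_command, parallelism=2):
    """Assemble a run_experiment payload after basic sanity checks."""
    seen = set()
    for variant in variants:
        name = variant["name"]
        if name in seen:
            raise ValueError(f"duplicate variant name: {name}")
        seen.add(name)
        # Assumed rule: a variant carries a patch or full-file writes, not both.
        if bool(variant.get("patch")) == bool(variant.get("files")):
            raise ValueError(f"{name}: provide exactly one of 'patch' or 'files'")
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return {
        "workspace_id": workspace_id,
        "variants": variants,
        "eval_command": eval_command,
        "parallelism": parallelism,
    }


request = build_experiment_request(
    "ws_...",
    [
        {"name": "minimal-fix", "patch": "diff --git a/app.py b/app.py\n..."},
        {
            "name": "boundary-fix",
            "files": [{"path": "app.py", "content": "...full file content..."}],
        },
    ],
    eval_command="pytest -q",
)
print(sorted(request))
```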

run_experiment returns immediately:

{
  "experiment_id": "exp_...",
  "status": "running"
}

Poll with get_experiment:

{
  "experiment_id": "exp_..."
}

Completed result:

{
  "status": "completed",
  "winner": "minimal-fix",
  "results": []
}
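
The polling step can be sketched as a small loop. This is an illustration only: `poll_experiment` is a hypothetical helper, `get_experiment` is passed in as a plain callable standing in for the MCP tool call, and the interval and timeout defaults are arbitrary. The terminal statuses are the ones named in the example agent prompt.

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "no_passing_variant"}


def poll_experiment(get_experiment, experiment_id, interval_s=2.0,
                    timeout_s=600.0, sleep=time.sleep, clock=time.monotonic):
    """Call get_experiment until the record reaches a terminal status."""
    deadline = clock() + timeout_s
    while True:
        record = get_experiment(experiment_id)
        if record["status"] in TERMINAL_STATUSES:
            return record
        if clock() >= deadline:
            raise TimeoutError(f"{experiment_id} still {record['status']}")
        sleep(interval_s)


# Canned responses stand in for the real tool so the loop runs offline.
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "winner": "minimal-fix", "results": []},
])
final = poll_experiment(lambda _id: next(responses), "exp_...",
                        sleep=lambda _seconds: None)
print(final["winner"])  # minimal-fix
```

Treating failed and no_passing_variant as terminal keeps the caller from polling forever when no variant succeeds.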

Example Agent Prompt

Generate 2-3 meaningfully different candidate fixes as unified diffs or
full-file variants. Call prepare_workspace once, then call run_experiment with
those variants and the exact eval command. Poll get_experiment until the status
is completed, failed, or no_passing_variant. Do not call apply_winner unless the
user explicitly asks to apply the selected diff.

Smoke Run

EVOLUTION_MCP_STORAGE=local uv run evolution-modal-smoke /path/to/repo "pytest -q"

The smoke command runs a local worker loop for the queued smoke job. In normal usage, keep evolution-mcp-worker running separately.

Tests

PYTHONPATH=src python3 -m unittest discover -s tests
uv run --extra dev ruff check .

Known Limitations

  • Full logs/diffs are stored in Redis for V2. Large artifacts should move to blob storage or Modal Volume later.
  • Modal SDK 1.4.3 does not expose snapshot image deletion, so cleanup removes the local/Redis records and keeps track of the snapshot ID instead of deleting the snapshot.
  • apply_winner applies a stored diff to the local repo with git apply; if the repo changed since the experiment, the patch can fail.
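
One way to guard against the last limitation is to test the diff before applying it. The sketch below is not part of evolution-mcp: `patch_applies_cleanly` is a hypothetical helper that shells out to `git apply --check`, which reports whether a patch would apply without touching the working tree. The `run` parameter exists only so the function can be exercised without git installed.

```python
import subprocess


def patch_applies_cleanly(repo_path, patch_text, run=subprocess.run):
    """Return True if `git apply --check` accepts the patch in repo_path."""
    proc = run(
        ["git", "apply", "--check"],   # reads the patch from stdin
        cwd=repo_path,
        input=patch_text,
        text=True,
        capture_output=True,
    )
    return proc.returncode == 0
```

A caller could run this check on the winning diff and regenerate or rebase the variant when it returns False, rather than letting apply_winner fail partway through.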
