
aicraft

Run coding agents (Claude Code, Codex, Aider, OpenHands, …) against prompts in container sandboxes. A thin Python library + CLI on top of the Harbor framework, focused on ad-hoc single-prompt runs rather than full benchmark evaluation.

Status: alpha. API and CLI may shift between 0.1.x releases. Pin minor versions if you depend on it.

Why this exists

Harbor itself is built around full benchmark lifecycles: scaffolding a task directory, running an agent over a dataset, scoring with a verifier, publishing trajectories. That's heavy when all you want is "run agent X on prompt Y, give me the answer."

aicraft flips the abstraction:

  • Inline prompt or --prompt-file instead of a hand-built task directory
  • Verifier disabled by default
  • Auto-synthesizes the minimal Harbor task structure under the hood
  • ATIF-aware final-text extraction (works across any agent that emits ATIF)
  • Mount allowlist enforcement (refuses to mount paths outside configured roots)
  • Provider presets (--provider openrouter etc.) for routing OpenAI-protocol agents through aggregator gateways
  • Workarounds for current Harbor 0.4 quirks (Codex OPENAI_BASE_URL propagation)

It's the opinionated, ad-hoc-friendly subset of Harbor, in the same way gh wraps what would otherwise be a git push origin <branch> && curl github.com/... chain.

Install

pip install aicraft

aicraft requires Python 3.12+ and a working Docker daemon (or Podman with the Docker compatibility socket — set DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock). Harbor itself spawns each agent in a Docker container.

Usage

Library

import asyncio
from pathlib import Path
from aicraft import AgentRunner, AgentConfig, MountSpec

async def main():
    runner = AgentRunner()
    result = await runner.run(AgentConfig(
        prompt="Summarize the structure of /workspace/code in one paragraph.",
        agent="claude-code",
        mounts=[MountSpec(host=Path("/data/repo"), container=Path("/workspace/code"))],
        timeout_s=600,
    ))
    print(result.status)         # "completed" | "timeout" | "error"
    print(result.final_text)     # agent's textual reply (extracted from ATIF)
    print(result.trajectory_path)  # full trajectory on disk

asyncio.run(main())

CLI

# Inline prompt
aicraft run -a claude-code "Find dead code in this repo"

# Prompt from a file
aicraft run -a codex -M gpt-5 -f ./prompt.md

# With a mounted code directory
AICRAFT_MOUNT_ROOTS=/data/repos aicraft run -a claude-code \
  "Refactor the parser for readability" \
  --mount /data/repos/foo:/workspace/code:ro

# Specify model and a higher timeout, write structured result to file
aicraft run -a claude-code "..." \
  --model claude-sonnet-4-7 \
  --timeout 1800 \
  --output ./result.json

# Route an OpenAI-protocol agent through OpenRouter for open-weight models
OPENROUTER_API_KEY=sk-or-... aicraft run \
  -a codex -M deepseek/deepseek-chat --provider openrouter \
  "Describe the bug in /workspace/buggy.py and fix it" \
  --mount /tmp/scratch:/workspace:rw

# See installed agents
aicraft list-agents

# Locate a previously-run trajectory on disk
aicraft trajectory aicraft-3a2b1c0d9e8f

Mount syntax: host:container[:ro|rw] — default is ro. Pass :rw explicitly when the agent needs to write back.
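The host:container[:ro|rw] syntax can be parsed with a few lines of Python. This is an illustrative sketch of the documented grammar, not aicraft's actual parser; the function name is made up:

```python
from pathlib import Path

def parse_mount(spec: str) -> tuple[Path, Path, str]:
    """Parse "host:container[:ro|rw]" into (host, container, mode).

    Illustrative only; mirrors the syntax documented above.
    """
    parts = spec.split(":")
    if len(parts) == 2:
        host, container = parts
        mode = "ro"  # read-only is the documented default
    elif len(parts) == 3:
        host, container, mode = parts
        if mode not in ("ro", "rw"):
            raise ValueError(f"mode must be 'ro' or 'rw', got {mode!r}")
    else:
        raise ValueError(f"expected host:container[:ro|rw], got {spec!r}")
    return Path(host), Path(container), mode
```

So parse_mount("/data/repos/foo:/workspace/code") yields a read-only mount, matching the default.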

Output

aicraft run prints a JSON document to stdout for pipeline use:

{
  "status": "completed",
  "final_text": "def reverse_string(s: str) -> str:\n    return s[::-1]",
  "trajectory_path": "/path/to/trajectories/aicraft-7dad38f46662",
  "duration_s": 65.1,
  "trial_id": "aicraft-7dad38f46662",
  "error": null
}

…and a human-readable banner to stderr so the agent's reply is easy to spot amid debug logs:

════════════════════════════════════════════════════════════════════════
 AGENT OUTPUT  [OK]  trial=aicraft-7dad38f46662  65.1s
════════════════════════════════════════════════════════════════════════
def reverse_string(s: str) -> str:
    return s[::-1]
════════════════════════════════════════════════════════════════════════

Exit codes: 0 on completed, 1 on timeout / error, 2 on a pre-flight config issue (mount not allowed, missing required model, unknown provider).
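A pipeline step consuming the JSON document can derive its own pass/fail decision from the status field. The field names come from the example above and the exit-code mapping from this section; the helper itself is a hypothetical sketch:

```python
import json

# Documented convention: 0 on completed, 1 on timeout/error.
# (Code 2, pre-flight config issues, occurs before any JSON is printed.)
STATUS_TO_EXIT = {"completed": 0, "timeout": 1, "error": 1}

def triage(raw: str) -> tuple[int, str]:
    """Return (exit_code, final_text) for a captured aicraft run result."""
    result = json.loads(raw)
    return STATUS_TO_EXIT[result["status"]], result["final_text"] or ""
```

Because the JSON goes to stdout and the banner to stderr, `aicraft run ... > result.json` captures only the machine-readable part.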

The trajectory directory contains everything Harbor captured: the ATIF trajectory.json, the agent's raw session logs, and any verifier artifacts. For an aicraft run there's no verifier output, but the ATIF trajectory has the full tool-call history.

Configuration

  • AICRAFT_MOUNT_ROOTS: colon-separated absolute host paths permitted as mount sources. Default: empty, meaning no mounts are allowed.
  • AICRAFT_TRAJECTORY_DIR: where captured trajectories are stored; override per call with --trajectory-dir. Default: ./trajectories/ in the current working directory.
  • OPENAI_API_KEY, OPENAI_BASE_URL, ANTHROPIC_API_KEY, ANTHROPIC_BASE_URL, GEMINI_API_KEY, GOOGLE_API_KEY, CLAUDE_CODE_OAUTH_TOKEN: standard provider env vars consumed by Harbor agents.
  • OPENROUTER_API_KEY, FIREWORKS_API_KEY, TOGETHER_API_KEY, GROQ_API_KEY: provider-specific keys consumed by --provider <name>.
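The AICRAFT_MOUNT_ROOTS allowlist behaviour described above can be sketched as a resolve-then-prefix check. This is an assumption about the mechanism, not aicraft's actual code, and the function name is invented:

```python
from pathlib import Path

def mount_allowed(host_path: Path, roots_env: str) -> bool:
    """Check a mount source against an AICRAFT_MOUNT_ROOTS-style value.

    Illustrative sketch of the documented behaviour: an empty value
    means no mounts are allowed at all.
    """
    roots = [Path(r) for r in roots_env.split(":") if r]
    resolved = host_path.resolve()  # normalize before comparing prefixes
    return any(resolved.is_relative_to(root) for root in roots)
```

Resolving before the prefix check matters: without it, a path like /data/repos/../../etc would appear to sit under an allowed root.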

Provider presets

--provider <name> rewrites OPENAI_BASE_URL and OPENAI_API_KEY from a friendly name plus a provider-specific env var, so OpenAI-protocol agents (codex, copilot-cli) can route through aggregator gateways without manual env juggling. Anthropic-protocol agents (claude-code) read ANTHROPIC_* directly and need separate setup.

  • openrouter: https://openrouter.ai/api/v1 (key: OPENROUTER_API_KEY)
  • fireworks: https://api.fireworks.ai/inference/v1 (key: FIREWORKS_API_KEY)
  • together: https://api.together.xyz/v1 (key: TOGETHER_API_KEY)
  • groq: https://api.groq.com/openai/v1 (key: GROQ_API_KEY)
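The rewrite a preset performs amounts to a small table lookup. A sketch, using the base URLs and key variables from the table above; the function and dict names are illustrative:

```python
# Presets from the table above: base URL plus the env var holding the key.
PRESETS = {
    "openrouter": ("https://openrouter.ai/api/v1", "OPENROUTER_API_KEY"),
    "fireworks": ("https://api.fireworks.ai/inference/v1", "FIREWORKS_API_KEY"),
    "together": ("https://api.together.xyz/v1", "TOGETHER_API_KEY"),
    "groq": ("https://api.groq.com/openai/v1", "GROQ_API_KEY"),
}

def provider_env(name: str, env: dict[str, str]) -> dict[str, str]:
    """Return the OPENAI_* overrides a preset implies (illustrative)."""
    base_url, key_var = PRESETS[name]
    if key_var not in env:
        raise KeyError(f"{key_var} must be set to use --provider {name}")
    return {"OPENAI_BASE_URL": base_url, "OPENAI_API_KEY": env[key_var]}
```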

Limitations

See GOTCHAS.md for the full list with reproduction steps and workarounds. Highlights:

  • First run per agent is slow — Harbor 0.4 re-installs the agent CLI in a fresh container on every trial (~45–55s for claude-code/codex). Image build is cached; container is fresh.
  • Rootless Podman + claude-code — the agent runs fine, but trajectory ingestion fails reading session JSONL files (claude-code writes them at 0600, and rootless Podman's userns mapping makes them unreadable from the host). Use Docker for claude-code, or other agents (codex, aider, nop) on Podman — they're unaffected.
  • final_text for non-ATIF agents — may be empty; the trajectory directory still has the full record.
  • Codex requires an explicit --model — Harbor's codex agent has no default. aicraft pre-validates this in <1s.
  • --provider is OpenAI-protocol only — claude-code (Anthropic protocol) needs ANTHROPIC_BASE_URL + ANTHROPIC_API_KEY set manually for gateways like OpenRouter.

Related issues filed upstream

  • harbor#1514 — Defer litellm/aiohttp imports to avoid ~25s startup overhead

License

MIT — see LICENSE.
