Safety-oriented execution harness for coding agents

RepoAirlock

30-second verification

Run the CLI health check and fast unit suite:

git clone https://github.com/ZedingZhang/repoairlock.git
cd repoairlock
python3.12 -m pip install -e ".[dev]"
repoairlock doctor
python3.12 -m pytest tests/unit -q

Problem

Coding agents modify code autonomously. Without isolation, recording, and policy enforcement, it is difficult to answer basic questions:

  • What did the agent do?
  • Did it touch anything outside the repo?
  • Can we replay its changes?
  • Was the original workspace preserved?

RepoAirlock answers these questions through isolation, structured audit trails, and reproducible artifacts.

What RepoAirlock Does

RepoAirlock runs coding agents inside Docker containers, each working in an isolated git worktree. It records structured execution traces, enforces safety policies, and exports reproducible artifacts, all without modifying the user's original working tree.
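
As a rough illustration of the worktree isolation described above, the sketch below builds and runs a `git worktree add --detach` command from Python. The function names are hypothetical and are not taken from RepoAirlock's source; only the git subcommand and its flags are real.

```python
import subprocess


def worktree_add_command(repo: str, worktree: str, commit: str = "HEAD") -> list[str]:
    """Build the git command that checks out `commit` into a detached worktree."""
    # Hypothetical helper for illustration; not RepoAirlock's actual API.
    return ["git", "-C", repo, "worktree", "add", "--detach", worktree, commit]


def create_detached_worktree(repo: str, worktree: str, commit: str = "HEAD") -> None:
    """Create the worktree; raises CalledProcessError if git reports a failure."""
    subprocess.run(worktree_add_command(repo, worktree, commit), check=True)
```

With this approach the agent's edits land in the separate worktree directory rather than in the user's checkout, and `git worktree remove` can discard the directory afterwards.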

Capability Tiers

Tier | Name                       | What You Get
-----+----------------------------+-------------
0    | Process Wrapper            | Container isolation, artifact recording, patch export, resource monitoring, best-effort HTML reports
1    | Structured Events          | Import agent tool-call traces for process quality metrics
2    | Enforcement (preview only) | Pre-execution policy checks on individual tool calls (Bash/Edit/Write) via Claude Code hook adapter

Current status: v0.1.1-alpha. Tier 0 is verified at alpha quality. The Claude Code Tier 2 hook adapter module is preview only; the v0.1 CLI run path uses the command adapter.
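
To make the Tier 2 idea concrete, here is a minimal sketch of a pre-execution check on a single tool call. Everything in it is an assumption for illustration: the deny patterns, the `Decision` type, the `check_tool_call` function, and the payload keys (`command`, `file_path`) are hypothetical and do not reflect RepoAirlock's default rules or its hook adapter interface.

```python
from dataclasses import dataclass

# Hypothetical deny lists for illustration; not RepoAirlock's default rules.
DENIED_BASH_SUBSTRINGS = ("rm -rf /", "curl ", "sudo ")
PROTECTED_PATH_PREFIXES = ("/etc/", "/root/")


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def check_tool_call(tool: str, payload: dict[str, str]) -> Decision:
    """Decide whether a single tool call may run, before it executes."""
    if tool == "Bash":
        command = payload.get("command", "")
        for needle in DENIED_BASH_SUBSTRINGS:
            if needle in command:
                return Decision(False, f"command contains denied pattern {needle!r}")
        return Decision(True, "no denied pattern matched")
    if tool in ("Edit", "Write"):
        path = payload.get("file_path", "")
        for prefix in PROTECTED_PATH_PREFIXES:
            if path.startswith(prefix):
                return Decision(False, f"path is under protected prefix {prefix!r}")
        return Decision(True, "path is not protected")
    # Unknown tools are denied so that new tool types fail closed.
    return Decision(False, f"unsupported tool {tool!r}")
```

Substring matching is easy to bypass, so a check like this is a guard against accidents rather than a defense against an adversarial agent, which matches the non-guarantees listed later in this document.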

5-Minute Demo

# 1. Install
git clone https://github.com/ZedingZhang/repoairlock.git && cd repoairlock
python3.12 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

# 2. Check your environment
repoairlock doctor

# 3. Run an agent in the sandbox
repoairlock run \
  --repo examples/demo-repo \
  --image alpine:latest \
  -- sh -c "echo 'print(\"Hello, RepoAirlock!\")' > /workspace/hello.py"

# 4. Inspect the results
repoairlock list
repoairlock inspect <run-id>

# 5. Replay the patch (no agent re-invocation)
repoairlock replay <run-id> --repo examples/demo-repo

# 6. Compare two runs
repoairlock compare <run-a> <run-b>

# 7. View the HTML report, if report generation succeeded (open is the macOS command; use xdg-open on Linux)
open ~/.repoairlock/runs/<run-id>/report.html

Safety Guarantees

  • The agent never runs in the user's original working tree (INV-001)
  • Containers run without network and without Docker privileged mode by default; all Linux capabilities are dropped except DAC_OVERRIDE, which is retained so root inside the container can write to the bind-mounted workspace
  • Environment variables are injected via explicit allowlist — never the full host environment
  • Every sandbox execution attempt produces core auditable artifacts (manifest, events, logs, patch) even on failure; JSON/HTML report generation is best-effort
  • Original workspace fingerprints are verified before and after every run
  • Patch integrity is verified via SHA-256 before replay
  • Sandbox-configuration policy enforcement rejects dangerous Docker parameters at construction time
  • Command-level enforcement is only active when the Tier 2 Claude Code hook adapter is used

Explicit Non-Guarantees

RepoAirlock does not and cannot guarantee:

  • Prevention of container escape (this is a property of the Docker runtime)
  • Observability of all agent internal actions at Tier 0
  • Protection against actively malicious code inside the container
  • Absolute isolation (Docker is a process-level boundary, not a hardware boundary)
  • Network egress filtering beyond on/off at Tier 0

RepoAirlock is a safety harness, not a security sandbox. When you explicitly enable network access (--network bridge), the agent can make outbound connections.
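
The container defaults described in the guarantees above map onto standard `docker run` flags. The sketch below assembles such an argument list; `build_docker_args` and its parameters are hypothetical names, and the exact arguments RepoAirlock passes are not documented here, so treat this as an approximation under those assumptions.

```python
from collections.abc import Mapping, Sequence

ALLOWED_NETWORKS = ("none", "bridge")


def build_docker_args(
    image: str,
    command: Sequence[str],
    worktree: str,
    host_env: Mapping[str, str],
    env_allowlist: Sequence[str] = (),
    network: str = "none",
) -> list[str]:
    """Assemble a `docker run` argument list with restrictive defaults."""
    # Hypothetical helper for illustration; not RepoAirlock's actual API.
    if network not in ALLOWED_NETWORKS:
        raise ValueError(f"network must be one of {ALLOWED_NETWORKS}, got {network!r}")
    args = [
        "docker", "run", "--rm",
        "--network", network,
        "--cap-drop", "ALL",
        "--cap-add", "DAC_OVERRIDE",
        "--volume", f"{worktree}:/workspace",
        "--workdir", "/workspace",
    ]
    # Only allowlisted variables are forwarded; the rest of host_env is ignored.
    for name in env_allowlist:
        if name in host_env:
            args += ["--env", f"{name}={host_env[name]}"]
    return args + [image, *command]
```

Rejecting unknown network modes when the argument list is built means a bad configuration fails before any container starts, and `--privileged` is never emitted because the function has no code path that adds it.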

Architecture

flowchart TD
    cli["repoairlock CLI<br/>run / inspect / replay / compare / list / doctor"]
    cfg["RunConfig<br/>created by run"]
    orchestrator["RunOrchestrator<br/>validate / run / finalize"]
    policy["PolicyEngine + command_builder<br/>sandbox configuration checks"]
    workspace["WorkspaceManager<br/>detached worktree + source fingerprint"]
    docker["DockerBackend<br/>safe Docker args + resource stats"]
    artifacts["ArtifactStore + EventRecorder<br/>manifest / events.jsonl / logs / patch / integrity"]
    report["ReportGenerator<br/>report.json + report.html"]
    replay["ReplayService + CompareService<br/>read run artifacts"]

    cli --> cfg --> orchestrator
    orchestrator --> policy
    orchestrator --> workspace
    workspace -->|isolated worktree| docker
    docker -->|stdout / stderr / exit / patch| artifacts
    artifacts --> report
    artifacts --> replay

The default v0.1 run path executes the user-provided command through the Docker backend. The Claude Code adapter is a preview hook integration, not a separate stage after sandbox execution.
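
The diagram names an `events.jsonl` artifact. One plausible shape for such a log is one JSON object per line, appended as the run progresses. The sketch below assumes that layout; the field names `ts` and `kind` and both function names are hypothetical, not RepoAirlock's actual event schema.

```python
import json
import time
from pathlib import Path


def append_event(events_path: Path, kind: str, **fields: object) -> dict[str, object]:
    """Append one event as a single JSON line and return the recorded event."""
    # Hypothetical schema: a timestamp, an event kind, and free-form fields.
    event = {"ts": time.time(), "kind": kind, **fields}
    with events_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, sort_keys=True) + "\n")
    return event


def read_events(events_path: Path) -> list[dict[str, object]]:
    """Load every event from the log, skipping blank lines."""
    with events_path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Appending line by line means events written before a crash remain readable, which fits the goal of producing auditable artifacts even when a run fails.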

Demo Screenshots

HTML report: [image: RepoAirlock demo HTML report screenshot]

Audit trail: [image: RepoAirlock demo audit trail screenshot]

CLI output: [image: RepoAirlock demo CLI output screenshot]

Example Report

When best-effort report generation succeeds, repoairlock run --repo examples/demo-repo --image alpine -- sh -c "..." produces an HTML report with these sections:

  1. Run Summary — status, wall time, exit code, HEAD SHA
  2. Safety Posture — network mode, privileged status, env allowlist, INV-001
  3. Repository Change Summary — files/lines changed, sensitive path detection
  4. Verification Result — verifier exit code (if configured)
  5. Resource Usage — peak memory, avg CPU, peak PIDs, network I/O
  6. Quality & Policy Findings — capability tier notice, safety findings
  7. Artifact Integrity — SHA-256 hashes
  8. Replay Instructions — exact command to reproduce

Every report explicitly states the capability tier and what conclusions cannot be drawn at that tier.
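
The artifact integrity section lists SHA-256 hashes, and a patch is verified against its hash before replay. A minimal sketch of that check follows, assuming the expected digest is available as a hex string. Both function names are hypothetical, and where RepoAirlock stores the expected value is not specified here.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large patches need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_patch(patch_path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the patch on disk does not match the recorded digest."""
    # Hypothetical helper for illustration; not RepoAirlock's actual API.
    actual = sha256_file(patch_path)
    if actual != expected_sha256.strip().lower():
        raise ValueError(f"patch digest mismatch: expected {expected_sha256}, got {actual}")
```

Calling `verify_patch` before applying the patch ensures that a replay uses exactly the bytes recorded at the end of the original run.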

Supported Adapters

Adapter     | Tier             | Description
------------+------------------+------------
command     | 0                | Any CLI agent; wraps the command as-is for sandbox execution
claude_code | 2 (preview only) | Claude Code hook adapter module: PreToolUse policy enforcement + PostToolUse recording; CLI wiring is not exposed in v0.1

Development Roadmap

Phase | Status      | Deliverable
------+-------------+------------
0     | Done        | Repository scaffolding + constraints
1     | Done        | Artifact store + event logging
2     | Done        | WorkspaceManager + source repo protection
3     | Done        | DockerSandbox + doctor
4     | Done        | CommandAdapter + first complete run
5     | Done        | Inspect, replay, compare
6     | Done        | Metrics + HTML report
7     | Done        | Policy engine (12 default rules)
8     | Done        | Claude Code Tier 2 hook adapter module
9     | In progress | Stabilization + public release

See docs/progress.md for details.

Limitations

  • Platform: Linux-first. macOS via Docker Desktop is supported but resource limits, filesystem semantics, and performance differ. CI targets Linux.
  • Windows: No native Windows support in v0.1.
  • Tier 0 visibility: Cannot observe agent internal tool calls, LLM token usage, or per-command reasoning. The HTML report explicitly states this.
  • Network filtering: Only on/off (none/bridge). No domain allowlisting.
  • Docker dependency: Requires Docker daemon. Podman/buildah not yet supported.
  • PostToolUse testing: The Claude Code Tier 2 adapter's PostToolUse recording requires a real Claude Code binary for full end-to-end testing.

Running Tests

pip install -e ".[dev]"
ruff check .
mypy src
pytest -q                    # unit + integration (skips Docker tests without daemon)
pytest -q tests/e2e          # e2e tests (requires Docker)

Attribution

RepoAirlock is a safety harness for coding agents. It does not perform code generation, LLM inference, or autonomous repair. It does not guarantee "complete security" or "absolute isolation." It provides a structured, auditable, and reproducible execution environment so that agent actions can be reviewed, replayed, and compared.
