Deterministic state oracle and semantic action codec for computer-use agents

groundcrew

Quick Start · How It Works · CLI Reference · GitHub Action · vs. Alternatives · Contributing

Why

Computer-use agents act on real software: they write files, call APIs, run scripts. But how do you know what they actually did vs. what they were supposed to do?

Screenshot-based LLM judges give you a visual approximation at best. They miss side effects — the extra file written, the config silently overwritten, the database row changed. And they cannot replay, diff, or audit what happened.

groundcrew inverts the architecture: instead of watching from the outside, it snapshots the filesystem before and after every action and produces a content-addressed ActionReceipt — a tamper-evident record of exactly what changed. No guessing. No LLM judge. Just a deterministic diff.

groundcrew capture --root . --verb write --target config.json --run "python agent.py"
# → ActionReceipt: 3 files added, 1 modified, diff stored in .groundcrew/receipts.db

How It Works

flowchart LR
    A[Agent declares\nActionSpec\nverb · target · params] --> B[Oracle captures\nStateSnapshot BEFORE\nSHA-256 of file tree]
    B --> C[Agent runs\nthe action]
    C --> D[Oracle captures\nStateSnapshot AFTER]
    D --> E[SnapshotDiff\nadded · removed · modified]
    E --> F[ActionReceipt\nspec + before_id + after_id + diff]
    F --> G[ReceiptStore\nSQLite persistence]

Core primitives:

FileState — a content-addressed snapshot of a single file: path, size, SHA-256.
StateSnapshot — a content-addressed snapshot of a directory tree. ID = SHA-256[:16] of sorted file states.
SnapshotDiff — the structural delta between two snapshots: added, removed, modified files. .added and .removed are list[FileState]; each element has a .path attribute (relative path string). .modified is list[tuple[FileState, FileState]] (before, after).
ActionSpec — a semantic, content-addressed action description: (verb, target, params). ID = SHA-256[:16] of the spec. The same action on the same target always produces the same ID.
ActionReceipt — binds an ActionSpec to a before-snapshot ID, after-snapshot ID, and SnapshotDiff. Stored permanently as an audit trail.
ReceiptStore — SQLite-backed store. Save receipts, retrieve by ID, list history.
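
To make the shape of SnapshotDiff concrete, here is a minimal sketch of how added, removed, and modified entries could be derived from two lists of file states. The `FileState` dataclass and `diff_snapshots` function below are hypothetical stand-ins written for illustration, not groundcrew's own classes; only the field layout (path, size, SHA-256) and the result types follow the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileState:
    """Hypothetical stand-in for the FileState primitive described above."""
    path: str     # path relative to the snapshot root
    size: int     # size in bytes
    sha256: str   # hex digest of the file contents


def diff_snapshots(before, after):
    """Return (added, removed, modified) between two lists of FileState.

    added and removed are lists of FileState; modified is a list of
    (before, after) pairs for paths whose content hash changed.
    """
    old = {f.path: f for f in before}
    new = {f.path: f for f in after}
    added = [new[p] for p in sorted(new.keys() - old.keys())]
    removed = [old[p] for p in sorted(old.keys() - new.keys())]
    modified = [
        (old[p], new[p])
        for p in sorted(old.keys() & new.keys())
        if old[p].sha256 != new[p].sha256
    ]
    return added, removed, modified


# Placeholder digests keep the example short; real ones would be SHA-256 hex.
before = [FileState("a.txt", 3, "aaa"), FileState("b.txt", 3, "bbb")]
after = [FileState("b.txt", 4, "ccc"), FileState("c.txt", 1, "ddd")]
added, removed, modified = diff_snapshots(before, after)

print([f.path for f in added])                    # ['c.txt']
print([f.path for f in removed])                  # ['a.txt']
print([(b.path, a.size) for b, a in modified])    # [('b.txt', 4)]
```

Comparing by hash rather than by size or modification time is what keeps the result deterministic: two runs over the same bytes yield the same diff.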

Snapshots are computed by walking the directory tree with os.walk, hashing each file with SHA-256, and content-addressing the whole collection. This is purely Python standard-library code — no kernel hooks, no elevated privileges, no platform-specific APIs required.
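
The walk-and-hash approach described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's actual code: the function names `hash_file` and `snapshot_tree`, the tab-separated serialization of file states, and the reading of "SHA-256[:16]" as the first 16 hex characters of the digest are all guesses made for the example.

```python
import hashlib
import os
import tempfile


def hash_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot_tree(root):
    """Return (snapshot_id, states) for the directory tree under root.

    states is a sorted list of (relative_path, size, sha256) tuples.
    The ID format is an assumption: the first 16 hex characters of the
    SHA-256 over the sorted, serialized file states.
    """
    states = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            states.append((rel, os.path.getsize(full), hash_file(full)))
    states.sort()  # sorting makes the ID independent of walk order
    serialized = "\n".join(f"{p}\t{s}\t{h}" for p, s, h in states)
    snapshot_id = hashlib.sha256(serialized.encode("utf-8")).hexdigest()[:16]
    return snapshot_id, states


with tempfile.TemporaryDirectory() as root:
    config = os.path.join(root, "config.json")
    with open(config, "w") as fh:
        fh.write('{"key": "value"}')
    first_id, states = snapshot_tree(root)
    second_id, _ = snapshot_tree(root)    # unchanged tree
    with open(config, "w") as fh:
        fh.write('{"key": "other"}')
    third_id, _ = snapshot_tree(root)     # same path, different content

print(first_id == second_id, first_id == third_id)  # True False
```

A real implementation would also need a policy for symlinks, unreadable files, and the receipt database's own directory, none of which this sketch handles.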

Features

Feature	Details
Content-addressed snapshots	Same file tree always produces the same snapshot ID
Deterministic diffs	Added, removed, and modified files — no approximation
Semantic action codec	`ActionSpec` is portable, content-addressed, version-robust
Tamper-evident receipts	`ActionReceipt` binds intent to effect, stored permanently
SQLite receipt store	Single-file persistence, no server, works offline
Rich terminal output	Color diff tables, receipt summaries
JSON output	Machine-readable for downstream automation
Markdown output	Ready-to-paste audit reports
FastAPI REST server	`/capture`, `/receipt/{id}`, `/receipts`, `/diff/{id}`
MCP server	Model Context Protocol integration for Claude and other agents
91 tests	Comprehensive test suite covering all layers
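
The content-addressing and tamper-evidence rows above rest on one idea: an ID derived from a canonical encoding of the data it names. The sketch below shows that idea with plain dictionaries. The `content_id` helper, the canonical-JSON encoding, and the dictionary layout of the spec and receipt are assumptions made for illustration; groundcrew's actual serialization may differ.

```python
import hashlib
import json


def content_id(payload):
    """First 16 hex chars of SHA-256 over a canonical JSON encoding.

    sort_keys and fixed separators make the encoding independent of
    dictionary insertion order and whitespace.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# The same (verb, target, params), written in a different key order.
spec_a = {"verb": "write", "target": "config.json",
          "params": {"key": "value", "mode": "w"}}
spec_b = {"params": {"mode": "w", "key": "value"},
          "target": "config.json", "verb": "write"}
print(content_id(spec_a) == content_id(spec_b))   # True

# A receipt binds the spec ID to snapshot IDs and a diff (placeholder values).
receipt = {
    "spec_id": content_id(spec_a),
    "before_id": "0" * 16,
    "after_id": "f" * 16,
    "diff": {"added": ["config.json"], "removed": [], "modified": []},
}
receipt_id = content_id(receipt)

# Editing any field after the fact no longer matches the stored ID.
tampered = dict(receipt, after_id="e" * 16)
print(content_id(tampered) == receipt_id)         # False
```

Note that a truncated hash gives evidence of accidental or casual modification; anyone who can rewrite both the receipt and its ID can still forge a record, so this is not a substitute for signing.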

Quick Start

pip install groundcrew            # core library + CLI
pip install "groundcrew[api]"     # + FastAPI REST server
pip install "groundcrew[mcp]"     # + MCP server

import json, pathlib, tempfile
from groundcrew import Oracle, ActionSpec, ReceiptStore

# Use a temporary directory so nothing is left behind in your cwd
with tempfile.TemporaryDirectory() as tmpdir:
    # Declare what you're about to do
    spec = ActionSpec(verb="write", target="config.json", params={"key": "value"})

    # Capture before/after state around the action
    with Oracle(tmpdir, spec) as oracle:
        pathlib.Path(tmpdir, "config.json").write_text(json.dumps({"key": "value"}))

    receipt = oracle.record(spec)
    print(receipt.diff.changed_paths)   # {'config.json'}
    print(receipt.id)                   # content-addressed ID

    # Persist for auditing (db lives inside the temp dir — cleaned up automatically)
    store = ReceiptStore(f"{tmpdir}/.groundcrew/receipts.db")
    store.save(receipt)
    # tmpdir and all its contents are removed when the `with` block exits
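
For readers curious about the persistence step, the following is a rough sketch of what a single-file SQLite receipt store could look like. `TinyReceiptStore`, its table schema, and its method names are hypothetical and chosen for the example; they are not taken from groundcrew's ReceiptStore, whose schema is not documented here.

```python
import json
import sqlite3


class TinyReceiptStore:
    """Hypothetical stand-in for a SQLite-backed receipt store."""

    def __init__(self, db_path):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS receipts ("
            "id TEXT PRIMARY KEY, "
            "created_at TEXT DEFAULT CURRENT_TIMESTAMP, "
            "body TEXT NOT NULL)"
        )

    def save(self, receipt_id, receipt):
        """Store a receipt (a JSON-serializable dict) under its ID."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO receipts (id, body) VALUES (?, ?)",
                (receipt_id, json.dumps(receipt, sort_keys=True)),
            )

    def get(self, receipt_id):
        """Return the stored receipt, or None if the ID is unknown."""
        row = self.conn.execute(
            "SELECT body FROM receipts WHERE id = ?", (receipt_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def list_ids(self):
        """Return receipt IDs in insertion order."""
        rows = self.conn.execute("SELECT id FROM receipts ORDER BY rowid")
        return [r[0] for r in rows]


store = TinyReceiptStore(":memory:")   # in-memory database for the demo
store.save("abc123de", {"added": ["config.json"], "modified": []})
store.save("0f0f0f0f", {"added": [], "modified": ["config.json"]})

print(store.list_ids())                 # ['abc123de', '0f0f0f0f']
print(store.get("abc123de")["added"])   # ['config.json']
print(store.get("missing"))             # None
```

Keying rows by a content-addressed ID means saving the same receipt twice is idempotent, which suits an append-only audit trail.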

CLI Reference

groundcrew [--db PATH] COMMAND [OPTIONS]

Command	Description	Key options
`capture`	Snapshot before/after a shell command	`--root DIR`, `--verb VERB`, `--target TARGET`, `--run CMD`
`diff RECEIPT_ID`	Show the SnapshotDiff for a stored receipt	—
`log`	List all stored receipts	—
`status`	Show database info	—
`watch DIRECTORY`	Watch a directory for unexpected changes	`--interval SECS`, `--max-checks N`, `--allow PATH`
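
The `watch` row can be pictured as a polling loop. The sketch below is a guess at the general technique, not the CLI's implementation: it assumes `--interval` is the delay between checks, `--max-checks` bounds the number of checks, and `--allow` names paths whose changes are expected and therefore not reported. The `tree_digests` and `watch` functions are hypothetical names.

```python
import hashlib
import os
import tempfile
import time


def tree_digests(root):
    """Map each relative file path under root to its SHA-256 hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as fh:
                digests[rel] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def watch(root, interval=1.0, max_checks=3, allow=()):
    """Yield, for each check, the sorted paths that changed unexpectedly.

    A path counts as changed if it was added, removed, or rehashed since
    the previous check; paths listed in allow are filtered out.
    """
    allowed = set(allow)
    baseline = tree_digests(root)
    for _ in range(max_checks):
        time.sleep(interval)
        current = tree_digests(root)
        changed = {
            p for p in baseline.keys() | current.keys()
            if baseline.get(p) != current.get(p)
        }
        yield sorted(changed - allowed)
        baseline = current


with tempfile.TemporaryDirectory() as root:
    checks = watch(root, interval=0.0, max_checks=2, allow=["scratch.log"])
    first = next(checks)                  # nothing has changed yet
    for name in ("scratch.log", "surprise.txt"):
        with open(os.path.join(root, name), "w") as fh:
            fh.write("x")
    second = next(checks)                 # scratch.log is allowed, so ignored

print(first, second)  # [] ['surprise.txt']
```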

Global options:

Option	Default	Env var
`--db PATH`	`.groundcrew/receipts.db`	`GROUNDCREW_DB`
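
One plausible reading of the table above is the usual precedence for command-line tools: an explicit `--db` value wins, then the `GROUNDCREW_DB` environment variable, then the default path. The helper below sketches that order; `resolve_db_path` is a hypothetical function and the precedence itself is an assumption, since the README does not spell it out.

```python
import os

DEFAULT_DB = ".groundcrew/receipts.db"


def resolve_db_path(cli_value=None, environ=None):
    """Pick the database path: CLI flag, then env var, then default."""
    environ = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    return environ.get("GROUNDCREW_DB") or DEFAULT_DB


# Explicit environments keep the example independent of the real shell.
print(resolve_db_path(None, {}))                                  # .groundcrew/receipts.db
print(resolve_db_path(None, {"GROUNDCREW_DB": "/tmp/audit.db"}))  # /tmp/audit.db
print(resolve_db_path("ci.db", {"GROUNDCREW_DB": "/tmp/audit.db"}))  # ci.db
```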

Examples:

# Capture what an agent script does to the current directory
groundcrew capture --root . --verb run --target agent.py --run "python agent.py"

# Show what changed
groundcrew diff abc123de

# List all receipts
groundcrew log

# Status
groundcrew status

GitHub Action

Add groundcrew auditing to your CI pipeline:

# .github/workflows/groundcrew.yml
name: groundcrew audit
on: [push, pull_request]

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: sandeep-alluru/groundcrew@main
        with:
          root: .
          db: .groundcrew/receipts.db

vs. Alternatives

	groundcrew	Screenshot judges	AgentSight	OSWorld verifiers
Verification method	Filesystem diff	Vision LLM	eBPF syscall trace	Per-task custom code
Deterministic	Yes — content-addressed	No — probabilistic	Partial	Yes (per app)
No per-app code	Yes	Yes	Yes	No — 33 apps manually
Production runtime	Yes	Yes	Linux-only	VM/sandbox only
Audit trail	SQLite receipts	None	Log files	None
Action codec	Portable ActionSpec	None	None	None
Open source	MIT	N/A	MIT	Research
Python package	Yes	N/A	No	No

groundcrew is not a replacement for security-layer tools like AgentSight. It is specifically designed for agent developers who need a simple, deterministic record of what their agent changed on disk — suitable for testing, auditing, and CI/CD gating.

Claude / MCP integration

groundcrew ships a Model Context Protocol server that lets Claude and other MCP-compatible agents record and query action receipts directly:

# Start the MCP server
groundcrew-mcp

# In your Claude Code project's .claude/settings.json:
{
  "mcpServers": {
    "groundcrew": {
      "command": "groundcrew-mcp"
    }
  }
}

Once connected, Claude can call groundcrew/capture_state, groundcrew/get_receipt, and groundcrew/list_receipts as tools. See docs/mcp.md for the full tool schema.

OpenAI integration

groundcrew exposes a FastAPI REST server compatible with OpenAI's function-calling format. The tool definitions are in tools/openai-tools.json and the full API spec is in openapi.yaml.

# Start the REST server
uvicorn groundcrew.api:app --reload

# Pass to Codex CLI or any OpenAI-compatible agent
codex --tools tools/openai-tools.json "Capture what this script does to the filesystem"

Endpoints: GET /health, POST /capture, GET /receipt/{id}, GET /receipts, GET /diff/{id}. See docs/openai.md for details.

Case Studies

See how teams are using groundcrew in production:

Repository structure

groundcrew/
├── src/
│   └── groundcrew/
│       ├── snapshot.py       # FileState, StateSnapshot, SnapshotDiff
│       ├── codec.py          # ActionSpec, ActionReceipt (content-addressed)
│       ├── oracle.py         # Oracle context manager, capture(), ReceiptStore
│       ├── report.py         # print_receipt(), print_diff(), to_json(), to_markdown()
│       ├── cli.py            # Click CLI (capture, diff, log, status, watch)
│       ├── api.py            # FastAPI REST server
│       └── mcp_server.py     # MCP server
├── tests/
│   ├── test_snapshot.py      # StateSnapshot, SnapshotDiff unit tests
│   ├── test_codec.py         # ActionSpec, ActionReceipt unit tests
│   ├── test_oracle.py        # Oracle context manager, ReceiptStore tests
│   ├── test_report.py        # Formatter tests
│   ├── test_cli.py           # CLI subprocess integration tests
│   ├── test_cli_runner.py    # Click CliRunner tests
│   └── test_api.py           # FastAPI TestClient tests
├── examples/
│   └── demo.py               # Standalone demo script
├── docs/                     # MkDocs documentation
├── tools/
│   └── openai-tools.json     # OpenAI function-calling tool definitions
├── assets/
│   ├── hero.png              # README hero image
│   └── logo.png              # Project logo
├── action.yml                # GitHub Action
├── openapi.yaml              # OpenAPI 3.1 spec
├── pyproject.toml            # Package metadata + dependencies
└── CONTRIBUTING.md           # Contribution guide

Release history

Version	Release date
0.1.3	Jun 24, 2026
0.1.2	Jun 21, 2026
0.1.1 (this version)	Jun 20, 2026
0.1.0	Jun 19, 2026
