OpenShard

The control layer for AI coding agents.

AI coding agents can write code, but engineering teams still need to know what ran, what changed, what context was used, whether checks passed, and what it cost, and they need a way to prove it. OpenShard wraps AI coding agent runs with routing, review boundaries, checks, evidence, cost tracking, evals, feedback, and durable Shard receipts.

Agents write code. OpenShard controls the run and proves what happened.

Why OpenShard exists

AI coding agents are becoming good enough to work on real repos, infrastructure, and production-shaped systems.

That creates a new problem. Not “can the model write code?” but:

Which model or workflow handled the task?
What files did it inspect?
What did it change?
Did checks pass, fail, skip, or not run?
What did the run cost?
Was anything risky gated or reviewed?
Is there a durable receipt of what happened?

OpenShard is built for the work around the agent: routing, verification, policy, evidence, cost awareness, and auditability. The valuable unit is not a single model call. It is a completed engineering task with evidence, checks, cost, and a receipt.

What OpenShard does

OpenShard is a CLI tool for controlling and recording AI coding agent runs.

It can:

Run real repo tasks through a controlled execution path
Route work across models and workflows where available
Classify task risk
Gate risky writes and commands
Record model used, risk, checks, changed files, evidence, cost, and result
Produce durable Shard receipts for every run
Support read-only review flows that preserve Changed 0 files
Provide workflow packs for repeatable engineering reviews
Compare models and workflows through local evals
Track feedback and session signals around runs

OpenShard is not trying to replace Claude Code, Codex, Cursor, OpenCode, or other coding agents.

Those tools do the coding work.

OpenShard sits around them as the control and audit layer.
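
To make "classify task risk" and "gate risky writes and commands" concrete, here is a minimal sketch. The risk levels, the path patterns, and the `classify_path` and `is_write_allowed` helpers are illustrative assumptions, not OpenShard's actual policy rules.

```python
from enum import IntEnum
from fnmatch import fnmatch


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical patterns for illustration; a real policy would be configurable.
HIGH_RISK_PATTERNS = ("*.tf", "*.tfvars", ".github/workflows/*", "*secrets*")
MEDIUM_RISK_PATTERNS = ("*.yaml", "*.yml", "Dockerfile")


def classify_path(path: str) -> Risk:
    """Return the risk level for a proposed write to `path`."""
    if any(fnmatch(path, pattern) for pattern in HIGH_RISK_PATTERNS):
        return Risk.HIGH
    if any(fnmatch(path, pattern) for pattern in MEDIUM_RISK_PATTERNS):
        return Risk.MEDIUM
    return Risk.LOW


def is_write_allowed(path: str, approved: bool, threshold: Risk = Risk.HIGH) -> bool:
    """Writes at or above `threshold` are blocked unless explicitly approved."""
    return classify_path(path) < threshold or approved


if __name__ == "__main__":
    for candidate in ("README.md", "config/app.yaml", "infra/iam.tf"):
        verdict = "allowed" if is_write_allowed(candidate, approved=False) else "gated"
        print(f"{candidate}: {classify_path(candidate).name} -> {verdict}")
```

The design point is that the gate is a small, inspectable function: a reviewer can read the patterns and see exactly why a write was held back.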

Current developer loop

The current local developer loop is:

Ask -> Plan -> Run -> Inspect -> Feedback

Ask
Ask OpenShard product, model, command, and policy questions.

Plan
Generate a local execution plan. Plan Mode v1 is deterministic and local: it does not scan the repo, call a provider, or write files.

Run
Send a real repo task through OpenShard’s controlled execution path.

Inspect
Review the result, actions taken, evidence, checks, changed files, cost estimate, model choice, and Shard receipt.

Feedback
Record whether the result was accepted, partial, rejected, or needs more work.
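
The Plan step above is described as deterministic and local. A minimal sketch of what that property means, assuming a hypothetical `make_plan` function and step wording that are not OpenShard's actual planner: the plan is a pure function of the task text, so the same input always yields the same plan, and nothing touches the repo, a provider, or the disk.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    plan_id: str            # stable identifier derived only from the task text
    steps: tuple[str, ...]


def make_plan(task: str) -> Plan:
    """Hypothetical planner: a pure function of the task text (no I/O)."""
    normalized = " ".join(task.lower().split())
    steps = ["Clarify scope: " + normalized]
    # Simple keyword rules stand in for whatever logic a real planner uses.
    if "review" in normalized:
        steps.append("Inspect files read-only and collect findings")
    else:
        steps.append("Propose changes and gate risky writes for review")
    steps.append("Run available checks and record outcomes")
    steps.append("Write a receipt for the run")
    plan_id = hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]
    return Plan(plan_id=plan_id, steps=tuple(steps))


if __name__ == "__main__":
    plan = make_plan("Review this repo for production readiness")
    for number, step in enumerate(plan.steps, start=1):
        print(f"{number}. {step}")
```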

Quick install

Recommended: pipx

pipx install git+https://github.com/MichaelObasa/openshard.git

Run:

openshard tui

Alternative: uv

uv tool install git+https://github.com/MichaelObasa/openshard.git

Local development:

git clone https://github.com/MichaelObasa/openshard.git
cd openshard
pip install -e .

See docs/install.md for upgrade instructions and notes.

Quick demo

Launch the TUI:

openshard tui

Inside the TUI:

/ask what models do you support?
/plan review this repo for production readiness
/packs
/pack production-iac-hardening

Run a real repo task:

Review and harden this deliberately flawed Terraform codebase. Assess it through security/compliance posture, 2am operability, and developer experience for a 5-10 person engineering team. Identify critical, high, and medium risks. Explain trade-offs. Do not apply changes directly without review.

Inspect the latest run:

/last more

Or from the shell:

openshard last --more

Leave feedback:

openshard feedback --outcome accepted --note "Useful review"

Production IaC demo

The examples/production-infra-demo/ directory contains a fictional GCP workload called DocuVault — a sanitised demo scenario for OpenShard.

The infrastructure is intentionally production-shaped: networking, IAM, Cloud SQL, Cloud Run, storage, secrets, monitoring, and logging.

It is deliberately flawed to serve as the input for an infrastructure-as-code hardening review.

All names, project IDs, resource IDs, CIDRs, and accounts are fake and public-safe, and no employer or customer details are included. The demo is designed to show a serious IaC review, not a toy example.

A typical production IaC review can show:

Critical, high, and medium findings
File-level evidence such as iam.tf, secrets.tf, database.tf, network.tf, and storage.tf
Verification output from tools like terraform fmt, terraform validate, and tflint when available
A clear Changed 0 files receipt for read-only reviews
Model selection and cost tracking
A /last more view with the full Shard, findings, checks, evidence, and cost comparison

This is the core OpenShard use case: let AI help with serious engineering work, but keep the control, evidence, and receipt layer visible.
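
One way to back a "Changed 0 files" claim for a read-only review is to hash the working tree before and after the run and compare. The sketch below assumes hypothetical `snapshot` and `changed_files` helpers; the source does not describe how OpenShard actually detects changes.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its contents."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Paths that were added, removed, or modified between two snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "iam.tf").write_text("# placeholder file for the demo\n")
        before = snapshot(root)
        # A read-only review would run here and should leave the tree untouched.
        after = snapshot(root)
        print(f"Changed {len(changed_files(before, after))} files")
```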

Shard receipts

A Shard is the durable receipt for an AI engineering run.

It can show:

Task and agent
Model used
Strategy
Risk level
Context provenance
Inspected files
Changed and touched files
Checks and their outcomes
Findings, when structured findings exist
Cost
Actions timeline
Result

OpenShard can also record feedback and infer session signals around a run.

openshard last --more    # expanded receipt for the latest run
openshard last --full    # full stored details

Raw developer content is not stored by default.
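
The receipt fields listed earlier in this section could be modelled roughly as follows. This is a sketch under assumptions: the `ShardReceipt` class, its field names, and the JSON layout are hypothetical and only mirror the list above, not OpenShard's stored format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ShardReceipt:
    """Hypothetical receipt shape; field names mirror the documented list."""
    task: str
    agent: str
    model: str
    strategy: str
    risk: str
    inspected_files: list[str] = field(default_factory=list)
    changed_files: list[str] = field(default_factory=list)
    checks: dict[str, str] = field(default_factory=dict)  # check name -> outcome
    cost_usd: float = 0.0
    result: str = "unknown"

    def summary(self) -> str:
        """One-line summary in the style of a 'Changed N files' receipt."""
        return f"Changed {len(self.changed_files)} files; checks recorded: {len(self.checks)}"

    def to_json(self) -> str:
        """Serialise with sorted keys so stored receipts diff cleanly."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    receipt = ShardReceipt(
        task="Review this repo for risks",
        agent="example-agent",
        model="example-model",
        strategy="read-only review",
        risk="high",
        inspected_files=["iam.tf", "network.tf"],
        checks={"terraform validate": "skipped"},
    )
    print(receipt.summary())
    print(receipt.to_json())
```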

One run, end to end

A normal OpenShard run can capture:

Task - the user request or workflow pack prompt.
Routing - which model or workflow was selected.
Risk - whether the task is low, medium, or high risk, or requires stronger review.
Execution - what the agent did during the run.
Checks - verification results, including passed, failed, skipped, or not run.
Evidence - files inspected, findings, and relevant source references.
Changes - files changed, touched, or left untouched.
Cost - estimated spend for the run.
Receipt - a durable Shard record that can be inspected later.

The point is simple: every AI coding run should leave behind enough evidence for a developer or team to understand what happened.
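
The four check outcomes named above (passed, failed, skipped, not run) can be sketched as a small runner. The `run_check` helper and the choice to report a missing tool as "skipped" are assumptions made for illustration, not OpenShard's documented behaviour.

```python
import shutil
import subprocess
import sys
from enum import Enum


class CheckOutcome(str, Enum):
    PASSED = "passed"
    FAILED = "failed"
    SKIPPED = "skipped"   # tool is not installed, so the check could not run
    NOT_RUN = "not_run"   # check was never attempted


def run_check(command: list[str], cwd: str = ".") -> CheckOutcome:
    """Run one verification command and map its exit code to an outcome."""
    if shutil.which(command[0]) is None:
        return CheckOutcome.SKIPPED
    completed = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
    return CheckOutcome.PASSED if completed.returncode == 0 else CheckOutcome.FAILED


if __name__ == "__main__":
    checks = {
        "python available": [sys.executable, "--version"],
        "terraform fmt": ["terraform", "fmt", "-check"],
        "tflint": ["tflint"],
    }
    for name, command in checks.items():
        print(f"{name}: {run_check(command).value}")
```

Keeping "skipped" distinct from "failed" matters for the receipt: a reader should be able to tell a check that found problems from one that never had the tool available.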

How OpenShard is different

OpenShard is not a chatbot, IDE, or even a generic agent framework. It's the layer around agentic coding work.

Layer	What it does
Coding agent	Generates code, edits files, answers task prompts
Model router	Chooses which model or workflow should handle the job
Verification layer	Runs checks and records whether they passed, failed, skipped, or were not run
Policy layer	Gates risky writes, commands, and high-risk work
Receipt layer	Records model, cost, evidence, checks, changed files, and result
Eval layer	Compares models and workflows by outcome, cost, speed, and safety

OpenShard can work alongside tools like Claude Code, Codex, Cursor, OpenCode, LangChain, LangGraph, OpenRouter, and provider APIs.

The goal is not to replace every coding agent. The goal is to make AI coding work controllable, inspectable, and measurable.
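
As a rough illustration of the model router row in the table above, the sketch below picks a model role and records why. The role names `reasoning` and `cheap_control` appear in the command reference; the `route` function and its rules are hypothetical and do not describe OpenShard's real routing logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingDecision:
    role: str
    reason: str  # recorded so the choice can be inspected later


def route(risk: str, read_only: bool) -> RoutingDecision:
    """Hypothetical rule: higher-risk or writing tasks go to a reasoning model."""
    if risk == "high":
        return RoutingDecision("reasoning", "high-risk task")
    if read_only:
        return RoutingDecision("cheap_control", f"{risk}-risk read-only task")
    return RoutingDecision("reasoning", f"{risk}-risk task that may write files")


if __name__ == "__main__":
    for risk, read_only in (("low", True), ("medium", False), ("high", True)):
        decision = route(risk, read_only)
        print(f"risk={risk} read_only={read_only} -> {decision.role} ({decision.reason})")
```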

Workflow packs

Workflow packs are pre-built prompts for repeatable engineering reviews.

openshard packs list
openshard packs show production-iac-hardening
openshard packs prompt production-iac-hardening

Built-in packs include:

repo-explanation
production-iac-hardening
terraform-networking-review
iam-security-review
cicd-safety-review
powershell-automation-review

Workflow packs make common review patterns repeatable without forcing users to rewrite long prompts every time.
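
Conceptually, a workflow pack is a named, reusable prompt. A minimal sketch of such a registry follows; the `WorkflowPack` class, the `get_pack` helper, and the description and prompt strings are placeholders, with only the pack names taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowPack:
    name: str
    description: str
    prompt: str


# Pack names come from the built-in list; descriptions and prompts are placeholders.
PACKS = {
    pack.name: pack
    for pack in (
        WorkflowPack(
            "repo-explanation",
            "Placeholder description",
            "Placeholder prompt text for repo-explanation",
        ),
        WorkflowPack(
            "production-iac-hardening",
            "Placeholder description",
            "Placeholder prompt text for production-iac-hardening",
        ),
    )
}


def get_pack(name: str) -> WorkflowPack:
    """Look up a pack by name, listing the valid names on failure."""
    try:
        return PACKS[name]
    except KeyError:
        available = ", ".join(sorted(PACKS))
        raise ValueError(f"Unknown pack {name!r}; available: {available}") from None


if __name__ == "__main__":
    print(get_pack("production-iac-hardening").prompt)
```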

Command reference

Most developers should start with the TUI:

openshard tui                                      # Launch the OpenShard terminal UI

Run tasks:

openshard run "Review this repo for risks"         # Run a task through OpenShard from the shell
openshard run --workflow native "Fix this bug"     # Run using the native workflow path

Inspect the latest run:

openshard last                                     # Show the latest run summary
openshard last --more                              # Show the expanded Shard receipt
openshard last --full                              # Show full stored/debug details

Record feedback:

openshard feedback --outcome accepted              # Mark the latest run as accepted
openshard feedback --outcome partial               # Mark the latest run as partly useful
openshard feedback --outcome rejected              # Mark the latest run as not useful
openshard feedback --outcome needs_work            # Mark the latest run as needing more work

Infer local session signals:

openshard session infer                            # Infer local behavioural/session signals from run history

Workflow packs:

openshard packs list                               # List available workflow packs
openshard packs show production-iac-hardening      # Show details for a workflow pack
openshard packs prompt production-iac-hardening    # Print the pack prompt

Model registry and policy:

openshard models list                              # List registered models
openshard models role reasoning                    # Show reasoning-capable models
openshard models role cheap_control                # Show low-cost/control models
openshard models mode ask                          # Show Ask Mode model policy
openshard models mode plan                         # Show Plan Mode model policy

Local evals:

openshard eval list                                # List eval suites
openshard eval validate --suite basic              # Validate an eval suite
openshard eval run --suite basic                   # Run an eval suite
openshard eval report                              # Show latest eval report
openshard eval compare                             # Compare models by eval results
openshard eval stats                               # Show eval stats

Useful TUI commands:

/ask what models do you support?                   # Ask OpenShard product/model questions
/plan review this repo for production readiness    # Generate a local plan without writing files
/packs                                             # List workflow packs inside the TUI
/pack production-iac-hardening                     # Load a workflow pack inside the TUI
/last                                              # Show the latest run
/last more                                         # Show expanded run details
/last full                                         # Show full debug/audit details
/feedback accepted                                 # Record feedback for the latest run
/clear                                             # Clear the output panel
/quit                                              # Exit the TUI

What works today

OpenShard is still alpha, but the core local loop is working.

Current features include:

Local CLI and TUI (openshard tui)
Ask Mode for local product/model/command Q&A
Plan Mode v1 for deterministic local plans
Controlled run path for real repo tasks
OpenShard Native execution harness
Task classification and risk handling
Model registry and model policy inspection
Routing across models/workflows where available
Shard receipts with model, risk, files, checks, cost, evidence, and result
/last, /last more, and /last full
Read-only review handling that preserves Changed 0 files
Intent-specific review handling for Terraform/IaC, CI/CD, auth/security, tests, and docs/onboarding
Workflow packs for repeatable engineering reviews
Feedback signals
Session signal inference
Local run history
Local eval harness
Eval comparison by pass rate and cost-per-pass
Cost comparison in /last more
Production-shaped Terraform demo
5,500+ passing tests and green CI

What is not built yet

OpenShard is early and intentionally local-first.

Not built yet:

No hosted team platform yet
No cloud sync yet
No hosted dashboard for teams yet
No IDE integration yet
No PyPI or Homebrew release yet — install from GitHub
Plan Mode is not repo-aware yet
Ask Mode and Plan Mode are local deterministic v1 flows
Feedback advisory does not automatically change routing yet
External harness adapters are experimental and not guaranteed
Not a full Claude Code, Codex, or Cursor replacement

Developer setup

git clone https://github.com/MichaelObasa/openshard.git
cd openshard
pip install -e .
python -m pytest -q
python -m ruff check .

Advanced: evals

OpenShard includes a local eval harness for checking routing and workflow behaviour.

openshard eval list
openshard eval validate --suite basic
openshard eval run --suite basic
openshard eval report
openshard eval compare
openshard eval stats

The goal is not just to ask “which model is best?”

The better question is:

Which model or workflow succeeds most reliably for this type of task, at what cost, with what safety profile?

The eval system can track:

Pass rate
Verification outcomes
Duration
Token usage where available
Cost where available
Cost per passing run
Unsafe file attempts
Model ranking across eval runs

This is the foundation for smarter routing over time: routing based on actual task outcomes.
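
To show how pass rate and cost per passing run might be computed and used for ranking, here is a small worked sketch. The `EvalRun` record, the sample numbers, and the definition of cost per pass as total spend (including failed runs) divided by the number of passes are assumptions; OpenShard's own eval reports may define these differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalRun:
    model: str
    passed: bool
    cost_usd: float


def summarize(runs: list[EvalRun]) -> dict[str, dict[str, float]]:
    """Per-model pass rate and cost per passing run."""
    summary = {}
    for model in sorted({run.model for run in runs}):
        model_runs = [run for run in runs if run.model == model]
        passes = sum(run.passed for run in model_runs)
        total_cost = sum(run.cost_usd for run in model_runs)
        summary[model] = {
            "pass_rate": passes / len(model_runs),
            # Total spend divided by passes; infinite when nothing passed.
            "cost_per_pass": total_cost / passes if passes else float("inf"),
        }
    return summary


def rank(summary: dict[str, dict[str, float]]) -> list[str]:
    """Higher pass rate first; lower cost per pass breaks ties."""
    return sorted(
        summary,
        key=lambda model: (-summary[model]["pass_rate"], summary[model]["cost_per_pass"]),
    )


if __name__ == "__main__":
    # Made-up sample data with placeholder model names.
    runs = (
        [EvalRun("model-a", passed, 0.75) for passed in (True, True, True, False)]
        + [EvalRun("model-b", passed, 0.25) for passed in (True, True, False, False)]
    )
    stats = summarize(runs)
    for model in rank(stats):
        print(model, stats[model])
```

With this sample data, model-a passes more often but costs more per pass, which is the kind of trade-off the comparison is meant to surface.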

Current validation state

OpenShard is still early, but it is not just a prototype.

Current validation includes:

5,500+ passing tests
Green CI
Ruff-clean Python codebase
Local CLI/TUI workflow
Production-shaped Terraform demo
Workflow packs for repeatable reviews
Shard receipts for run history
Eval tooling for model and workflow comparison
Pre-launch usage from developers testing it on real work

The project is alpha, but the core loop is working:

Run the task -> inspect what happened -> verify the output -> create a receipt

Roadmap

Near-term roadmap:

Public open-source launch
More real-world developer testing
Better repo-aware planning
Stronger model/workflow ranking from real outcomes
More workflow packs
More repo analyzers for common stacks
Cleaner setup and release packaging
Hosted/team run history
Team policies and shared approval gates
Dashboards for cost, model usage, and verification outcomes

Longer-term, OpenShard should become the control plane teams use to manage AI engineering work.

Why open source?

Routing decisions should be inspectable.

If a tool decides which model touches security-sensitive code, developers should be able to see why.

OpenShard is open because trust, integrations, and routing policies improve when real users can inspect and extend the system.

Open source also keeps the local-first layer useful on its own. Hosted and team features can come later, but the core control layer should be understandable and inspectable.

Contributing

Contributions are welcome around:

Routing policies and scoring logic
Repo analyzers for new stacks
Model profiles and capability data
Evaluation datasets
Provider integrations
Workflow packs
CLI/TUI UX improvements
Documentation and examples

See CONTRIBUTING.md for details.

Security

If you find a security issue, please report it privately before opening a public issue.

See SECURITY.md.

License

MIT
