Runtime intelligence layer for AI-era Python software: observe, verify, explain, and gate releases on evidence.
Barx — Runtime Intelligence Layer for AI-Era Python Software
Everything your code did, explained through evidence.
Barx observes Python code while it runs, verifies behavior, explains every runtime decision with evidence, audits what AI coding agents change, and turns a run into a GREEN/AMBER/RED release verdict you can defend — locally, with zero telemetry.
AI is changing how code gets written. Barx focuses on what comes next: runtime trust — can this run ship, and what proves it?
Barx 1.0.0 is the first stable release of the Barx Runtime Intelligence Layer. Barx has been rebuilt and repositioned: the 2025 PyPI release (0.1.0, "Fast, CPU-only AI framework") was a different product and is fully retired — see the changelog. Everything in the claims registry is implemented and tested; the limits are documented in What Barx is not and docs/CLAIMS.md.
Barx Studio: the local evidence workspace (barx studio). A real
recorded run — GREEN verdict, score and confidence, evidence categories,
and the Evidence Spine. Graphite Dark
ships too.
What Barx helps answer
- Can this run ship? → barx release-check (GREEN / AMBER / RED with evidence)
- What changed between two runs? → barx drift
- What failed, and where? → the Risks view in Studio, the report's exceptions section
- What did the coding agent do to my repo? → barx.AgentAudit
- What evidence supports this release? → the HTML report and Barx Studio
Quickstart
pip install barx # core; add barx[api] for API testing
Until the 1.0.0 wheel lands on PyPI (publishing is a deliberate manual step; see docs/publishing.md), install from source:
pip install git+https://github.com/TheBarmaEffect/barx
The 0.1.0 currently on PyPI is the retired 2025 package, not this product.
import barx
seen = barx.Collection() # starts as a list
seen.extend(f"url-{i}" for i in range(250))
for i in range(2000): # workload turns lookup-heavy...
    seen.contains(f"url-{i % 250}")
print(seen.backend()) # -> "set" (switched, with evidence)
print(seen.explain()) # why, evidence, alternatives, confidence, rollback
Every decision is a structured event in .barx/runs/<run_id>/events.jsonl.
Nothing leaves your machine — no telemetry, no network calls.
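Because the event log is plain JSONL, it can be inspected with the standard library alone. The sketch below is not part of Barx: `load_events` and `count_by_type` are hypothetical helpers, and the `type` field name is an assumption, so check the event schema in docs/architecture.md before relying on it.

```python
import json
from collections import Counter
from pathlib import Path


def load_events(path):
    """Return (events, skipped) from a JSONL file, tolerating corrupt lines."""
    events, skipped = [], 0
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # blank lines carry no evidence
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1  # count the bad line instead of aborting the read
            continue
        if isinstance(record, dict):
            events.append(record)
        else:
            skipped += 1  # valid JSON, but not an event object
    return events, skipped


def count_by_type(events, key="type"):
    # "type" is an assumed field name, used here only for illustration.
    return Counter(str(event.get(key, "unknown")) for event in events)


# Usage (path pattern from the text above; <run_id> is left for you to fill in):
# events, skipped = load_events(".barx/runs/<run_id>/events.jsonl")
# print(count_by_type(events), "skipped:", skipped)
```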
barx trace your_script.py # record runtime spans
barx verify . # behavioral checks + static risk scan
barx release-check # GREEN/AMBER/RED verdict (RED exits 1)
barx report --html report.html # one self-contained evidence artifact
barx studio # local-only visual workspace at 127.0.0.1
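In a deploy script, the release check can be gated on its exit code. The following is a minimal sketch: the only behavior taken from this README is that a RED verdict exits 1. Treating every non-zero exit as "do not ship" is an assumption, and `release_gate` is a hypothetical helper, not a Barx API.

```python
import subprocess


def release_gate(command=("barx", "release-check")):
    """Run the gate command; return (ok, exit_code, stdout)."""
    result = subprocess.run(list(command), capture_output=True, text=True)
    # Assumption: any non-zero exit code blocks the release.
    return result.returncode == 0, result.returncode, result.stdout


# Usage inside a deploy script (requires the barx CLI on PATH):
# ok, code, output = release_gate()
# print(output, end="")
# if not ok:
#     raise SystemExit(code)
```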
See it
Every image is a real screenshot of real recorded runs
(scripts/make_showcase_runs.py + scripts/capture_screenshots.mjs),
not a mockup. The "viewer python" shown in the capsule is the Studio
process's interpreter (3.14 on the capture machine) — a labeled viewer
fact; the library itself is tested on 3.10–3.12.
Architecture
Barx is strictly layered around one rule — no event, no product. Every feature writes structured events to an append-only JSONL store; explanations, reports, scores, verdicts, and Studio are renderings of those events, never recomputations.
flowchart LR
M[Instrumentation:<br/>Trace · Verify · API · Guard ·<br/>Adaptive · AI Runtime · AgentAudit] -->|events| S[(events.jsonl<br/>per run)]
S --> R[build_report] --> H[HTML / JSON report]
R --> U[Barx Studio]
S --> G[Score → ReleaseGate] -->|GREEN / AMBER / RED| C[CI · PR comment · exit code]
The full map — layer contracts, the event schema, Guard's single-patch seam model, the privacy table, and extension points — is in docs/architecture.md.
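To make the "no event, no product" rule concrete, here is a small sketch of an append-only JSONL store with a report that is rendered only from stored events. It illustrates the pattern and is not Barx's implementation: `EventStore`, `render_report`, and the `id`/`ts`/`kind`/`data` fields are all hypothetical names.

```python
import json
import time
import uuid
from pathlib import Path


class EventStore:
    """Append-only JSONL store: events are written once and never rewritten."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.parent.mkdir(parents=True, exist_ok=True)

    def append(self, kind, **data):
        event = {"id": uuid.uuid4().hex, "ts": time.time(), "kind": kind, "data": data}
        with self.path.open("a", encoding="utf-8") as fh:  # "a" = append only
            fh.write(json.dumps(event, sort_keys=True) + "\n")
        return event["id"]

    def read(self):
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]


def render_report(events):
    """Build a report purely from stored events; each claim cites event ids."""
    exceptions = [e for e in events if e["kind"] == "exception"]
    return {
        "event_count": len(events),
        "exceptions": {
            "count": len(exceptions),
            "evidence": [e["id"] for e in exceptions],
        },
    }
```

The report function takes events as its only input, so anything it states can be traced back to a line in the log.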
Core concepts
- Run: one instrumented execution, stored under .barx/runs/<id>/.
- Event: a structured record with evidence; everything Barx shows is rendered from events, never invented.
- Evidence: the event ids behind every claim, score, and verdict.
- Report: build_report output, served as JSON or one self-contained HTML file. The portable artifact.
- Studio: a local-only viewer over that same report data.
- ReleaseGate: documented GREEN/AMBER/RED rules over the evidence.
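A verdict rule over evidence might look like the sketch below. It only mirrors two statements from this README: blockers are backed by evidence, and insufficient evidence yields AMBER, never GREEN. Mapping warnings to AMBER and the `min_evidence` threshold are assumptions, and `Finding` and `verdict` are hypothetical names; the actual rules are documented in the gate docs.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    message: str
    evidence: list = field(default_factory=list)  # ids of supporting events


def verdict(blockers, warnings, evidence_count, min_evidence=1):
    """Return "GREEN", "AMBER", or "RED" from findings and the evidence count."""
    if blockers:
        return "RED"
    if evidence_count < min_evidence:
        return "AMBER"  # too little evidence can never produce GREEN
    if warnings:
        return "AMBER"  # assumption: warnings alone downgrade the verdict
    return "GREEN"
```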
Main capabilities
- Trace: function spans, nesting, boundary exception capture; no argument values captured. (docs)
- Verify: behavioral verification over your cases plus a 20-rule AST risk scan (no code execution). (docs)
- API: API testing with runtime evidence; auth/tokens redacted. Optional barx[api] extra. (docs)
- Policy / Guard: runtime guardrails (observe/warn/strict) via reversible patch seams. Not a sandbox. (guard)
- Drift / Replay: compare two runs (comparative, not causal); replay GET-only and dry-run by default. (drift, replay)
- Score / ReleaseGate: evidence-backed score and verdict with stated formulas. (score, gate)
- Adaptive runtime: Collection, Cache, Router, and Pipeline, with evidence-backed, explainable, overridable switching. (collection)
- AI runtime: LLMTrace (prompts/responses hashed by default), PromptGuard (heuristic), Cost (estimates from your price table). (llm)
- AgentAudit: observable evidence of what an AI coding agent did to a repo. (docs)
- Evidence Testing: Mock (recorded replay), Contract (schema-lite), AutoTest (generated skeletons). (mock)
- Studio: local visual workspace. (docs)
- GitHub Action / VS Code MVP: Barx in PRs, CI, and the editor. (action, vscode)
Full module index: docs/README.md.
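The static half of Verify scans source without running it. The idea can be sketched with the standard `ast` module, as below. This is not Barx's rule set: the two rules, their severities, and the `scan_source` helper are illustrative assumptions.

```python
import ast

# Illustrative subset only; the real scan is described as having 20 rules.
RISKY_CALLS = {"eval": "high", "exec": "high"}


def scan_source(source, filename="<string>"):
    """Parse (never execute) the source and report risky calls with file:line."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            severity = RISKY_CALLS.get(node.func.id)
            if severity:
                findings.append(
                    {
                        "rule": f"call-{node.func.id}",
                        "severity": severity,
                        "line": node.lineno,
                        "location": f"{filename}:{node.lineno}",
                    }
                )
    return sorted(findings, key=lambda f: f["line"])
```

Only direct calls by name are matched here; aliased or attribute-based calls would need additional rules.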
What Barx is not
- Not a sandbox. Guard patches documented seams; it is not isolation.
- Not formal verification. Verify runs real cases and flags risks with evidence; it does not prove the absence of bugs.
- Not a cloud observability platform. Barx is local-first; nothing is uploaded.
- Not a Postman replacement. API testing brings runtime evidence to Python; it is not a full API client.
- Not an LLM provider or client. LLMTrace wraps your callables; Barx makes no provider calls and ships no provider SDKs.
- Not a guarantee of safety, correct billing, or coverage. PromptGuard is heuristic, Cost is an estimate, AutoTest generates starting points, and a GREEN gate means "no configured blockers in the available evidence" — not proof.
Privacy & security
- Local-first. No telemetry, no hidden network calls, local storage only.
- Prompts/responses hashed by default (SHA-256); raw capture is an explicit opt-in that still redacts secrets.
- Secrets redacted across events, reports, and fixtures (auth headers, tokens, cookies, api keys, passwords).
- Studio binds 127.0.0.1 by default with no telemetry and no external assets.
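The two privacy defaults above, hashing prompts and redacting secrets, can be sketched with `hashlib` as follows. The helper names, the `sha256:` prefix, the `[REDACTED]` marker, and the header list are assumptions made for illustration, not Barx's stored format.

```python
import hashlib

# Assumed list of sensitive header names, for illustration only.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key"}


def hash_text(text):
    """Store a SHA-256 digest in place of the raw prompt or response."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def redact_headers(headers):
    """Return a copy of the headers with sensitive values masked."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }
```

A digest lets two runs be compared for "same prompt or not" without the prompt text ever being written to disk.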
What works today
Every row below is implemented and tested. This table is the claims registry — Barx advertises nothing before it works. The detailed, categorized version with allowed/forbidden wording lives in docs/CLAIMS.md.
| Area | Status |
|---|---|
| Structured event system (stable schema, JSONL store, corrupt-line recovery) | ✅ tested |
| Runtime manager (fail-soft by default, strict mode opt-in, BARX_DISABLED) | ✅ tested |
| barx.Collection: adaptive backends (list, set, deque, heap, and sorted in value mode; dict in key-value mode) | ✅ tested |
| Collection strategy engine: thresholds + hysteresis + cooldown, confidence with stated formula, alternatives with rejection reasons, conversion-cost estimates | ✅ tested |
| Collection safety: lock_backend, data-preserving rollback (refused with a recorded warning if it would lose data), duplicate/unhashable/uncomparable fallbacks | ✅ tested |
| pop_min() / iter_sorted() on every value backend (cost varies by backend) | ✅ tested |
| Explain engine: evidence-backed answers to what/why/evidence/alternatives/confidence/rollback/override, per collection instance | ✅ tested |
| barx.Trace: function spans (no args/values captured), nesting, exception capture at the trace boundary, include/exclude filters, sampling, max_depth/max_events, fail-soft | ✅ tested |
| Trace ↔ Collection linkage: adaptive decisions during a trace are counted, listed, and linked via related_event_ids | ✅ tested |
| JSON reports (with a trace section: summaries, slowest spans, exceptions) | ✅ tested |
| HTML report: one self-contained file (inline CSS, no JS frameworks, no CDN, offline); explain-style decision cards, trace summary, CSS span timeline, event feed, raw evidence anchors, honest empty states and caps; escaped + secret-redacted; JSON fallback on failure | ✅ tested |
| barx.verify: behavioral verification (real cases, expected/contract checks, exception capture, latency + stability checks, type-hint warnings, redacted evidence) | ✅ tested |
| barx.verify_file / verify_project: AST risk scan (20 rules, critical→low), file:line evidence, no code execution | ✅ tested |
| Verification events + explain support + report sections (JSON and HTML) + stated confidence heuristic | ✅ tested |
| barx.API / barx.APISuite: API testing with runtime evidence (optional barx[api] extra); status/latency/header/JSON-path/schema-lite assertions, token capture + chaining, fail-fast or continue, declarative JSON specs | ✅ tested |
| API privacy: auth headers, cookies, and token-like values redacted in all stored evidence; no raw-secret flag exists | ✅ tested |
| barx.Policy / barx.Guard: runtime guardrails (not a sandbox); 10 active rules, observe/warn/strict modes, reversible patch seams always restored (incl. before strict violations propagate), allow_network/allow_file_delete approval contexts, latency budget | ✅ tested |
| Policy events + explain + report sections; evidence redacted; stdlib-internal eval/exec exempted (documented); barx.API runner never falsely flagged | ✅ tested |
| barx.Graph: project graph (best-effort AST structure: imports, classes, inheritance, local calls), runtime graph (evidence-backed from events; no invented links), failure graph (event-supported chains); JSON + Mermaid-text exports, caps with disclosure | ✅ tested |
| barx.Drift: compare two runs across 7 categories with stated thresholds, evidence event ids, improvement findings, and zero causal language (test-enforced) | ✅ tested |
| barx.Replay: dry-run by default, GET-only by default, status-parity assertions, disclosed skips, shell/eval/file/pickle/policy actions never replayed; evidence-based path reconstruction | ✅ tested |
| barx.Score: evidence-backed trust score (formula v1.0 stated in every result: weights, penalty table, evidence ids, limitations; no score without evidence) | ✅ tested |
| barx.ReleaseGate: documented GREEN/AMBER/RED rules (v1.0), release confidence with stated formula, blockers/warnings with evidence, insufficient evidence → AMBER never GREEN, PR-comment markdown | ✅ tested |
| barx.Cache: adaptive caching (lru/lfu/ttl/fifo/no_cache/auto) with evidence-backed strategy switches, decorator, bypass disclosure, injectable clock, RLock | ✅ tested |
| barx.Router: measured-evidence routing (fixed/round_robin/fastest/lowest_error/auto), fair warmup, disclosed fallback, exceptions never swallowed | ✅ tested |
| barx.Pipeline: environment detection via find_spec only (heavy frameworks never imported) + honest workflow recommendations with limitations | ✅ tested |
| barx.LLMTrace: callable wrapper (no provider SDKs, no provider calls); prompts/responses as SHA-256 hashes by default, tokens only when supplied, redacted opt-in capture | ✅ tested |
| barx.PromptGuard: heuristic output validation (JSON, schema-lite, unsafe commands, secret leakage, undeclared tools, injection markers); observe/warn/strict | ✅ tested |
| barx.Cost: estimates from user-supplied price tables only; missing prices/tokens disclosed, never assumed | ✅ tested |
| AI Runtime score dimension (only when LLM events exist) + gate rules + restrained report section with privacy note | ✅ tested |
| barx.AgentAudit: agent-session evidence with before/after snapshots (hash/metadata, contents never stored), dependency diffs, commands/network via Guard's seams, policy links, timeline | ✅ tested |
| barx.Mock: redacted replay fixtures from recorded evidence (X-Barx-Mock: recorded; misses disclosed, never invented; refuses without evidence) | ✅ tested |
| barx.Contract: schema-lite contracts from observed responses; drift = review finding, never a breakage claim | ✅ tested |
| barx.AutoTest: pytest skeletons generated from evidence (review-required banner, deterministic, nothing invented) | ✅ tested |
| Barx Studio: local-only run viewer (barx studio); 127.0.0.1, zero telemetry, no external assets, viewer-not-source-of-truth | ✅ tested |
| Local benchmarks (benchmarks/) incl. honest overhead numbers for Collection and Trace | ✅ tested |
| GitHub Action (barx-release-check): composite action with verdict, report, fail-on, PR comment via token; shells out to the CLI, no duplicated gate logic | ✅ tested |
| CI workflow (ci.yml): Python 3.10/3.11/3.12 matrix, ruff + format gates, coverage ≥90% gate | ✅ tested |
| VS Code MVP (vscode/barx): status bar + commands, shells out to CLI only, no telemetry/cloud/chat | ✅ tested |
| CLI: version, runs, latest, explain, report, trace, verify, api test, policy, guard, graph, drift, replay, score, release-check, pipeline, llm, cost, prompt-guard, agent-audit, mock, contract, autotest, studio, ci comment; --json where applicable | ✅ tested |
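The "thresholds + hysteresis + cooldown" row describes a common way to keep an adaptive structure from flapping between backends. Below is a generic sketch of that technique. It is not Barx's strategy engine: `BackendChooser`, the window size, and the 0.7/0.3 thresholds are assumptions chosen for illustration.

```python
from collections import deque


class BackendChooser:
    """Pick "list" or "set" from the recent share of lookup operations."""

    def __init__(self, upper=0.7, lower=0.3, cooldown=100, window=200):
        self.upper, self.lower, self.cooldown = upper, lower, cooldown
        self.backend = "list"
        self.decisions = []  # every switch is recorded with its evidence
        self._recent = deque(maxlen=window)
        self._ops_since_switch = 0

    def record(self, op):
        """Record one operation ("lookup" or anything else) and maybe switch."""
        self._recent.append(op == "lookup")
        self._ops_since_switch += 1
        if len(self._recent) < self._recent.maxlen:
            return self.backend  # not enough evidence yet
        if self._ops_since_switch < self.cooldown:
            return self.backend  # cooldown: too soon after the last switch
        ratio = sum(self._recent) / len(self._recent)
        target = self.backend
        # Hysteresis: the two thresholds differ, so a ratio that hovers
        # between them never triggers a switch in either direction.
        if self.backend == "list" and ratio >= self.upper:
            target = "set"
        elif self.backend == "set" and ratio <= self.lower:
            target = "list"
        if target != self.backend:
            self.decisions.append(
                {"from": self.backend, "to": target, "lookup_ratio": ratio}
            )
            self.backend = target
            self._ops_since_switch = 0
        return self.backend
```

The gap between `upper` and `lower` absorbs noisy workloads, and the cooldown bounds how often conversion cost can be paid.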
Limitations
Barx shows what the evidence holds; absent evidence renders as an empty state, never a guess. Guard is not isolation. Drift is comparative, not causal. PromptGuard, Score, and the gate are documented heuristics, not proofs. Cost is an estimate from your price table. AgentAudit cannot see inside child processes. AutoTest output requires human review. Supported on Python 3.10–3.12 (3.13/3.14 are unverified). The full list lives in docs/AUDIT.md and each module's doc.
Principles
- No event, no product. Explanations and reports are rendered from recorded events, never invented.
- Fail soft. Instrumentation failures never break your program unless you opt into strict mode.
- No magic. Every adaptive switch is loggable, explainable, rollback-able, and overridable.
- Private by default. No telemetry, no hidden network calls, local storage only.
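The fail-soft principle can be illustrated with a small decorator: an instrumentation hook that raises is recorded and swallowed unless strict mode is requested. This is a sketch of the pattern only; `fail_soft`, its `strict` parameter, and the `errors` list are hypothetical and do not describe Barx's runtime manager.

```python
import functools


def fail_soft(strict=False, errors=None):
    """Decorator factory: swallow hook errors unless strict=True."""
    recorded = errors if errors is not None else []

    def decorate(hook):
        @functools.wraps(hook)
        def wrapper(*args, **kwargs):
            try:
                return hook(*args, **kwargs)
            except Exception as exc:
                recorded.append(repr(exc))  # keep evidence of the failure
                if strict:
                    raise  # strict mode: surface the instrumentation bug
                return None  # fail soft: the host program keeps running

        return wrapper

    return decorate
```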
Development
python -m venv .venv && .venv/bin/pip install -e ".[dev]"
.venv/bin/pytest --cov=barx # full suite, coverage ≥ 90%
.venv/bin/ruff check barx tests && .venv/bin/ruff format --check barx tests
python scripts/launch_smoke.py # end-to-end launch smoke
python scripts/run_examples.py # run all safe examples
Docs index: docs/README.md · Architecture: docs/architecture.md · Website: docs/website.md · Claims: docs/CLAIMS.md · Changelog: CHANGELOG.md · Roadmap: docs/ROADMAP.md
License
MIT © 2026 Karthik Barma. See LICENSE. Built under the Aura banner.
File details
Details for the file barx-1.0.0.tar.gz.
File metadata
- Download URL: barx-1.0.0.tar.gz
- Upload date:
- Size: 284.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8906b1c1570cb47ec268240d2ad9eea30bb2fd564ec9c49248eea2bae7800481 |
| MD5 | f13120db8b815922c07fc48761e18df4 |
| BLAKE2b-256 | 3e1a28d3ef515f3ccae61116ada96137b6e93d53e4b8106d5344a1a861d30c82 |
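A downloaded file can be checked against a published SHA256 digest with the standard library. The helper names below are hypothetical, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with the published one."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Usage, with the SHA256 value from the table above:
# matches("barx-1.0.0.tar.gz",
#         "8906b1c1570cb47ec268240d2ad9eea30bb2fd564ec9c49248eea2bae7800481")
```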
Provenance
The following attestation bundles were made for barx-1.0.0.tar.gz:
Publisher: workflow.yml on TheBarmaEffect/Barx
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: barx-1.0.0.tar.gz
- Subject digest: 8906b1c1570cb47ec268240d2ad9eea30bb2fd564ec9c49248eea2bae7800481
- Sigstore transparency entry: 1808253532
- Sigstore integration time:
- Permalink: TheBarmaEffect/Barx@ca6047c94181b3e0ad712ed832d94776d1220559
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/TheBarmaEffect
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@ca6047c94181b3e0ad712ed832d94776d1220559
- Trigger Event: push
File details
Details for the file barx-1.0.0-py3-none-any.whl.
File metadata
- Download URL: barx-1.0.0-py3-none-any.whl
- Upload date:
- Size: 228.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b0c2e0b2a93d52116d58de36991284c5a41f49b29beb53f62cea2f4b5748ecce |
| MD5 | 6d192859775bc12c240de8aae0f5934d |
| BLAKE2b-256 | d147b5394bd3a531fa0395a803848c7204da319a489fcc7cd3fce50c28b799b8 |
Provenance
The following attestation bundles were made for barx-1.0.0-py3-none-any.whl:
Publisher: workflow.yml on TheBarmaEffect/Barx
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: barx-1.0.0-py3-none-any.whl
- Subject digest: b0c2e0b2a93d52116d58de36991284c5a41f49b29beb53f62cea2f4b5748ecce
- Sigstore transparency entry: 1808253564
- Sigstore integration time:
- Permalink: TheBarmaEffect/Barx@ca6047c94181b3e0ad712ed832d94776d1220559
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/TheBarmaEffect
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: workflow.yml@ca6047c94181b3e0ad712ed832d94776d1220559
- Trigger Event: push