Reliability for LLM agents through enforcement, not model size — Agent-Contract-Kernel (ACK) + a fail-closed orchestration engine.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

MJWC-AI-LAB

These details have not been verified by PyPI

Project description

Ironclad

Python Status

Reliability for LLM agents through enforcement, not model size. 🇦🇪 Built in the UAE by MJWC-AI-LAB.

Ironclad is a generic, model-agnostic framework for building reliable agentic systems. It pairs an Agent-Contract-Kernel (ACK) — schema-as-single-source- of-truth, validate→reask→retry, a generator and a preflight doctor — with a lean orchestration engine that turns multi-step agent workflows into deterministic, fail-closed pipelines.

The guiding principle: a small fast model with hard schema/validation enforcement beats a large model you "trust" to format its output. You get production-grade tool-calling reliability without depending on any specific model or parser — the kernel enforces the contract, not the weights.

🚧 Status: proven core, mid-redesign (pre-release)

Ironclad's engine comes from a proven, in-production orchestrator, but it is currently undergoing a complete redesign (single process → headless server + thin client, containerized, reasoning-worker fan-out, full-screen TUI). Not everything is re-wired or re-tested yet, and some features are still placeholders while the rebuild finishes (notably memory and autoplan). There is no tagged release; APIs, layout and config may change. Treat main as a development snapshot.

👉 docs/status.md is the honest, per-component wiring status (proven / wired+tested / placeholder / opt-in), the module reference, the memory situation, and the latest load-test results. Read it before relying on anything.

Reference environment. Ironclad is developed and exercised on an NVIDIA DGX Spark (GB10, Blackwell sm_121, 128 GB unified memory) running a local vLLM server with Qwen3.6-35B-A3B-NVFP4. Nothing is hard-wired to that box — any OpenAI-compatible endpoint works — but the defaults (localhost:8000, qwen3.6-35b), the throughput numbers and the constrained-decoding findings (NVFP4, XGrammar on the CUDA 13 nightly) reflect that hardware. See docs/dgx-spark.md for the full reference stack and a one-shot bootstrap (scripts/spark-bootstrap.sh).

Why

Contract-first. One Pydantic schema drives the prompt, the validator, the docs, and (where the hardware allows) constrained decoding. No drift between what you ask for and what you check.
Fail-closed pipelines. Macro steps (e.g. task hand-off, advancement) do the mechanical file work deterministically in code, not in model turns — fewer round-trips, no silent half-completions.
Model-agnostic. Swap the orchestrator model freely; reliability comes from the kernel, not the weights.
Standalone. No hidden dependency on any private deployment. Bring your own OpenAI-compatible endpoint (vLLM, etc.).

Demo

The full-screen client over the server/client split — a turn streams live, the toolbar shows live status (model · throughput · tasks · watcher). Illustrative transcript from a real session:

[You] > what is 17 times 23?
  [Qwen (planning)]
  17 times 23 is 391.
  [perf] TTFT 0.5s · 183 tok/2.9s = 64 tok/s · prompt 1739
  ======== ✓ DONE · ready · 1 gen · 3s · 183 tok ========
──────────────────────────────────────────────────────────────────────
│ [You] >
──────────────────────────────────────────────────────────────────────
 ██ Ironclad  powered by MJWC-AI-LAB
 ██  Orchestrator client · streaming   |   /help · exit · PageUp=history
     qwen3.6-35b · 64 tok/s · tasks 0P/0IP/0D · http://<server>:8100

Reply language is a setting (GX10_LANGUAGE — en default, ar, fr, …). The model answers in the configured language regardless of the input language. Real output with GX10_LANGUAGE=ar, same question:

[You] > what is 17 times 23?
  حاصل ضرب 17 في 23 هو 391.
  ======== ✓ DONE · ready · 1 gen · 2s ========

/command routing (local + forwarded), scrollback (PageUp/PageDown) and compressed multi-line paste are built in. There's also a plain line REPL and a monolithic CLI.

Benchmarks

Measured on the reference stack — a single NVIDIA DGX Spark (GB10) serving Qwen3.6-35B-A3B-NVFP4 via vLLM, driven over the LAN:

Workload	Result
Reasoning fan-out, 8 independent prompts	5.8× faster than serial (1.2 s vs 7.1 s), ~118 tok/s aggregate
Conversational turn (single agent)	~55–68 tok/s, ~2.1 s mean latency
Structured emission (ACK, thinking-off)	100% schema-valid in earlier measurement

Reproduce with your own endpoint; numbers scale with the model and GPU. Full method and the per-component wiring status live in docs/status.md.

A reliability layer, not another model

Ironclad doesn't compete with the open models — it makes them dependable to build on. It's model-agnostic by design: reliability comes from the contract kernel (schema → validate → reask), not the weights. So it's a natural agent/reliability layer for regional open models like Falcon, Jais and K2 Think — point it at any of them (running on other models) and get fail-closed pipelines, structured tool-calls and a thin local client, without forking or retraining anything.

A starting point to build on

Ironclad is a foundation, not a finished product — a generic agentic core meant to be extended. Concrete use cases dock onto it as vessels (see examples/demo-vessel/; a generator scaffolds new ones), so a broad audience can build their own domain agents on a reliable, self-hosted base rather than starting from scratch. Realistic directions the architecture supports today:

Edge & energy efficiency — a small, enforced model on local/edge hardware instead of a large cloud one. That efficiency bet is the whole premise of the project.
Education · healthcare · logistics — build a vessel for your domain: reliable tool-using agents and retrieval/RAG assistants over your own data, kept on-prem.

The repo's job is to give you a working starting point, not to ship every vertical — the verticals are yours to build.

Setup

Requires Python 3.10+ and an OpenAI-compatible endpoint (e.g. vLLM).

git clone https://github.com/GrokBuildMJW/ironclad.git
cd ironclad
python -m venv .venv && . .venv/bin/activate     # Windows: .venv\Scripts\Activate.ps1
pip install -e ".[engine]"                         # ACK + engine (openai, prompt_toolkit)

# Point at your model endpoint (defaults: http://localhost:8000/v1, qwen3.6-35b):
export GX10_BASE_URL=http://localhost:8000/v1
export GX10_MODEL=your-served-model-name
export GX10_API_KEY=...                             # only if your endpoint needs one

# Monolithic full-screen CLI (one process):
python engine/gx10.py --workdir ./my-workspace

Full walkthrough, the server/client split, and the reference vLLM launch: see SETUP.md — including copy-paste shell shortcuts (so you can just type ironclad) for Windows PowerShell, macOS and Linux.
Let an AI coding agent set it up for you (deterministic, verifiable runbook): see AGENTS.md.

A runnable demo vessel lives in examples/demo-vessel/ — a minimal, self-contained workspace showing a contract spec, a pipeline, and the doctor preflight. Real vessels stay in the operators' own private repos.

Layout

core/
  ack/                 # Agent-Contract-Kernel: case-spec, validated-emit, registry, doctor, generator
  engine/              # orchestration engine: agent loop, task store, fail-closed macros
  examples/demo-vessel # runnable example workspace
  LICENSE  NOTICE      # Apache-2.0

Roadmap

Honest near-term plan (the rebuild's placeholders are now wired — see docs/status.md for the full per-component status):

Broaden test coverage and harden the new server/client paths.
One-command compose for model + orchestrator + optional memory ships now (docker compose --profile model --profile memory up).
First tagged release once the APIs settle.

Sovereign AI / local deployments. Ironclad is model-agnostic and fully self-hostable — it talks to any OpenAI-compatible endpoint, so it already runs against locally-served open models (e.g. Falcon, Jais, K2 Think via vLLM — see running on other models) with no cloud dependency and data kept on your own infrastructure. On the roadmap: verified config recipes for those models, retrieval/RAG over local datasets through the memory hook, and on-prem agent templates for enterprise/government use cases. These are integration directions the architecture already supports, not shipped features yet.

Issues and discussions are welcome — this is an early, openly-developed project.

License

🇦🇪 Built in the United Arab Emirates by MJWC-AI-LAB.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

MJWC-AI-LAB

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.0.1

Jun 17, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ironclad_ai-0.0.1.tar.gz (64.2 kB view details)

Uploaded Jun 17, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

ironclad_ai-0.0.1-py3-none-any.whl (70.1 kB view details)

Uploaded Jun 17, 2026 Python 3

File details

Details for the file ironclad_ai-0.0.1.tar.gz.

File metadata

Download URL: ironclad_ai-0.0.1.tar.gz
Upload date: Jun 17, 2026
Size: 64.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ironclad_ai-0.0.1.tar.gz
Algorithm	Hash digest
SHA256	`a13fb75b366d10ab809f06f26f50aede63a5fdda2705e22b209544bd13b1388f`
MD5	`0920ce1b35a802983da6c48564027d97`
BLAKE2b-256	`c3a33c0320337b918fb6fa491c8066f90ccc7ccf044be64f25227437a5c02c86`

See more details on using hashes here.

Provenance

The following attestation bundles were made for ironclad_ai-0.0.1.tar.gz:

Publisher: publish.yml on GrokBuildMJW/ironclad

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ironclad_ai-0.0.1.tar.gz
- Subject digest: a13fb75b366d10ab809f06f26f50aede63a5fdda2705e22b209544bd13b1388f
- Sigstore transparency entry: 1843793672
- Sigstore integration time: Jun 17, 2026
Source repository:
- Permalink: GrokBuildMJW/ironclad@4649fbaa799d999dee17da221dd6263c7fa55f21
- Branch / Tag: refs/heads/main
- Owner: https://github.com/GrokBuildMJW
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4649fbaa799d999dee17da221dd6263c7fa55f21
- Trigger Event: workflow_dispatch

File details

Details for the file ironclad_ai-0.0.1-py3-none-any.whl.

File metadata

Download URL: ironclad_ai-0.0.1-py3-none-any.whl
Upload date: Jun 17, 2026
Size: 70.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ironclad_ai-0.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0f5cc4327a5d782b4f618c1f1d885ac632253952c4df17c7fbecadf29702519f`
MD5	`5c1913749aaff4754c3288f02abb6c09`
BLAKE2b-256	`5b712614073fe0f9d645a9631316266fb6787578ce803f1ac5363a8963f0888c`

See more details on using hashes here.

Provenance

The following attestation bundles were made for ironclad_ai-0.0.1-py3-none-any.whl:

Publisher: publish.yml on GrokBuildMJW/ironclad

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ironclad_ai-0.0.1-py3-none-any.whl
- Subject digest: 0f5cc4327a5d782b4f618c1f1d885ac632253952c4df17c7fbecadf29702519f
- Sigstore transparency entry: 1843793986
- Sigstore integration time: Jun 17, 2026
Source repository:
- Permalink: GrokBuildMJW/ironclad@4649fbaa799d999dee17da221dd6263c7fa55f21
- Branch / Tag: refs/heads/main
- Owner: https://github.com/GrokBuildMJW
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@4649fbaa799d999dee17da221dd6263c7fa55f21
- Trigger Event: workflow_dispatch

ironclad-ai 0.0.1

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

Ironclad

🚧 Status: proven core, mid-redesign (pre-release)

Why

Demo

Benchmarks

A reliability layer, not another model

A starting point to build on

Setup

Layout

Roadmap

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance