
CLI for Invarum, the Governance-grade LLM Quality Engineering platform.


⚡ Invarum CLI

Prompt it. Measure it. Fix it. Prove it. Bring certainty, and evidence, to LLM quality.

The Invarum CLI is a thin, fast client for the Invarum Cloud Engine. Use it to run quantitative LLM evaluations, generate audit-ready evidence bundles, and enforce policy gates in CI/CD, all without leaving the command line.

Get started: You need an Invarum account and API key. Sign up at app.invarum.com.


📦 Features

1) Headless Invarum Engine

Submit prompts to the Invarum Cloud, where they’re evaluated with the deterministic 4D Energy Model.

  • Live status: stream progress and view the final response in your terminal
  • Scoring: get immediate α / β / γ / δ scores in a readable table

2) Audit-Ready Evidence

Export forensic artifacts for any run, ready to attach to an incident review or internal audit packet.

  • JSON evidence bundle: machine-readable export containing scores, policy outcomes, metadata, and SHA-256 integrity hashes
  • PDF report: download a formatted audit report via the CLI

3) CI/CD Gating

Stop bad prompts from reaching production.

  • Use --strict to return exit code 1 when a run fails policy gates
  • Ideal for GitHub Actions, GitLab CI, and regression test suites

4) Enterprise Observability (OTel)

Invarum is OpenTelemetry (OTel) native.

  • Each run can emit standard OTel traces
  • Connect Datadog, Honeycomb, or New Relic to view quality signals alongside operational telemetry

⚛️ The Invarum Engine

Unlike “LLM-as-a-judge” tools that depend on subjective model opinions, Invarum evaluates outputs using a deterministic pipeline and returns repeatable scores, policy gate decisions, and audit-ready evidence bundles suitable for incident review and internal governance.

The 4D Energy Model

We measure LLM behavior along four orthogonal axes:

| Metric | Signal | What it Measures |
| --- | --- | --- |
| α TaskScore | Task alignment | Did the output follow the request and constraints (format, requirements, and reference match when provided)? |
| β Coherence | Semantic continuity | Did the response stay on track: logically consistent, well-structured, and free of drift or contradiction? |
| γ Entropy / Order | Variance & determinism | Is output variability appropriate for the domain and task (stable for scientific/legal; broader for creative/brainstorming)? |
| δ Efficiency | Cost-to-value | How much useful information was delivered per token (and time), relative to the expected structure and verbosity? |

The physics analogy is intentional: scores behave like measurable state variables, and policy gates define what “stable” looks like for a given domain.

Policy-as-Code Gating

Runs are evaluated against a selected Policy Profile (internal governance by default). The engine returns:

  • Gate results (must-pass requirements and scored thresholds)
  • An overall verdict plus an explicit decision state: pass / pass_with_advisory / fail_with_advisory / fail
  • Structured advisories with recommended remediation steps
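The relationship between gate results and the four decision states can be sketched as follows. This is only an illustration: the `ScoredGate` type, the gate names, the thresholds, and the mapping rules are all assumptions, since the engine's actual logic and the Policy Profile schema are not described here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ScoredGate:
    """Hypothetical scored threshold: fail below `minimum`, advise below `target`."""
    metric: str
    minimum: float
    target: float


def decide(scores: Dict[str, float],
           must_pass: Dict[str, bool],
           gates: List[ScoredGate]) -> Tuple[str, List[str]]:
    """Return (decision, messages) under the assumed mapping rules."""
    # Assumption: any failed must-pass requirement is a hard failure.
    failed_required = [name for name, ok in must_pass.items() if not ok]
    if failed_required:
        return "fail", [f"must-pass gate failed: {name}" for name in failed_required]

    messages: List[str] = []
    below_minimum = False
    for gate in gates:
        value = scores[gate.metric]
        if value < gate.minimum:
            below_minimum = True
            messages.append(f"{gate.metric}={value:.3f} is below minimum {gate.minimum:.3f}")
        elif value < gate.target:
            messages.append(f"{gate.metric}={value:.3f} is below target {gate.target:.3f}")

    if below_minimum:
        return "fail_with_advisory", messages
    if messages:
        return "pass_with_advisory", messages
    return "pass", messages
```

With scores of α = 0.892 and δ = 0.780 and hypothetical gates `ScoredGate("alpha", 0.70, 0.85)` and `ScoredGate("delta", 0.60, 0.80)`, this sketch returns `pass_with_advisory` with one message about δ falling short of its target.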

Security & Privacy

Invarum is designed for auditability without unnecessary data retention:

  1. BYOK: your LLM API keys are encrypted at rest and never exposed in plaintext.
  2. Configurable I/O retention: prompts and responses can be stored temporarily for debugging or minimized/redacted depending on workspace policy.
  3. Immutable evidence: evidence bundles retain SHA-256 hashes and run metadata for integrity verification, even when raw text retention is minimized.
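To illustrate why a retained hash is enough for integrity checks, here is a minimal sketch using Python's standard `hashlib`. The `evidence` dictionary and its `response_sha256` field are hypothetical; the real bundle schema, and exactly which bytes Invarum hashes, are not documented here.

```python
import hashlib
import hmac


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def matches_recorded_hash(candidate_text: str, recorded_hex: str) -> bool:
    """Check whether candidate_text hashes to the digest recorded at run time."""
    # compare_digest performs a constant-time comparison of the two digests.
    return hmac.compare_digest(sha256_hex(candidate_text), recorded_hex.lower())


# Hypothetical evidence record: only the digest is kept, not the raw response.
evidence = {
    "run_id": "run_a1b2c3d4",
    "response_sha256": sha256_hex("example response"),
}
```

Anyone who later holds a copy of the response can call `matches_recorded_hash(copy, evidence["response_sha256"])` to confirm it is byte-for-byte the text that was evaluated, even though the record itself stores no raw text.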

🚀 Installation

Install directly via pip:

pip install git+https://github.com/Invarum/invarum-cli.git@v0.1.5

Requires Python 3.9+


⚡ Quickstart

1) Get an API Key

Log in to the dashboard: Settings → Developer Access Keys.

2) Authenticate

Save your key locally. This persists until you revoke it.

invarum login --key inv_sk_your_secret_key_here

3) Run an Evaluation

invarum run "Summarize the main findings of this abstract in 5 bullets." --domain scientific

Example Output:

Running evaluation...
Run ID: run_a1b2c3d4

╭─ LLM Response ──────────────────────────────────────╮
│ 1. The study establishes a correlation between...   │
│ 2. Methodology involved a double-blind trial...     │
│ ...                                                 │
╰─────────────────────────────────────────────────────╯

┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Metric             ┃ Score ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ Alpha (Task)       │ 0.892 │
│ Beta (Coherence)   │ 0.910 │
│ Gamma (Entropy)    │ 0.450 │
│ Delta (Efficiency) │ 0.780 │
└────────────────────┴───────┘

Decision: PASS_WITH_ADVISORY
Policy Profile: internal_governance_default
View details: https://app.invarum.com/runs/run_a1b2c3d4

Tip: Open “View details” to inspect diagnostics, sensitivity analysis, and operator traces in the dashboard.


🛠 Advanced Usage

Reference-Based Grading

Provide a gold-standard answer to enable higher-fidelity grading when appropriate.

invarum run "Explain quantum entanglement" --reference "Quantum entanglement is a phenomenon where..."

Load from files:

invarum run -f prompt.txt --reference-file ground_truth.txt

Task, Domain, and Generation Overrides

Help classification or tune generation.

# Specify task and domain
invarum run "extract dates from this contract" --task extract --domain legal

# Override model temperature
invarum run "Write a creative poem" --temp 0.9

Export Evidence (Incident Review / Audit Packet)

# Export JSON evidence bundle
invarum export run_a1b2c3d4 --format json --output evidence.json

# Export formatted PDF audit report
invarum export run_a1b2c3d4 --format pdf --output report.pdf

CI/CD Integration

The CLI supports environment variables for automation.

export INVARUM_API_KEY="inv_sk_..."

# --strict forces a non-zero exit code on policy failure
invarum run -f prompt.txt --strict --json > results.json
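For pipelines that drive steps from Python rather than a shell script, a wrapper along these lines may be useful. It is a sketch, not part of the CLI: `run_gate` and `DEFAULT_COMMAND` are names invented here, and the only behavior it relies on is the documented non-zero exit code under `--strict`. It assumes `INVARUM_API_KEY` is already set in the environment.

```python
import subprocess
import sys
from typing import Sequence

# Command copied from the example above; adjust the prompt file as needed.
DEFAULT_COMMAND = ("invarum", "run", "-f", "prompt.txt", "--strict", "--json")


def run_gate(command: Sequence[str] = DEFAULT_COMMAND) -> int:
    """Run the command, echo its stdout, and return its exit code."""
    completed = subprocess.run(list(command), capture_output=True, text=True)
    if completed.stdout:
        print(completed.stdout, end="")
    if completed.returncode != 0:
        # A non-zero code can mean a policy failure or another error (e.g. auth).
        print(f"command exited with code {completed.returncode}", file=sys.stderr)
    return completed.returncode
```

A pipeline script could end with `sys.exit(run_gate())` so that the CI job fails whenever the CLI reports a non-zero exit code.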

🧠 Architecture

Invarum uses a thin client architecture:

  1. CLI (this repo): auth, file IO, request formatting, and rendering. No proprietary scoring logic runs locally.
  2. Cloud engine: prompts are evaluated by the PBPEF pipeline, producing scores, policy outcomes, traces, and evidence artifacts.
[CLI] → [API Gateway] → [PBPEF Pipeline] → [Run Record + Evidence]
  ↑                                                  ↓
  └────────────── summarized results ────────────────┘

❓ Troubleshooting

"Command not found" after installation?

If pip install succeeded but typing invarum gives an error, Python's scripts directory is probably not on your system PATH.

You can fix this by adding that directory to your PATH, or you can run the tool with python -m instead:

python -m invarum login
python -m invarum run "Test prompt"
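To find out which directory is missing from PATH, a short diagnostic like the following can help. It is a sketch using only the standard library; the function names are made up here. It checks the default scripts directory of the interpreter that runs it, so a `pip install --user` install, which places scripts in a separate per-user directory, may need a different path.

```python
import os
import shutil
import sysconfig


def scripts_dir_on_path() -> bool:
    """True if this interpreter's scripts directory is listed in PATH."""
    scripts_dir = os.path.normcase(os.path.normpath(sysconfig.get_path("scripts")))
    entries = os.environ.get("PATH", "").split(os.pathsep)
    return any(
        os.path.normcase(os.path.normpath(entry)) == scripts_dir
        for entry in entries
        if entry
    )


def diagnose() -> str:
    """Describe where the invarum executable is, or which directory to add."""
    location = shutil.which("invarum")
    if location:
        return f"invarum found at {location}"
    if scripts_dir_on_path():
        return ("scripts directory is on PATH; invarum may be installed "
                "under a different interpreter or a per-user location")
    return f"add this directory to PATH: {sysconfig.get_path('scripts')}"
```

Run `print(diagnose())` with the same Python interpreter that was used for pip install.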

---
## 🔬 Roadmap

**MVP (Live Now):**

* [x] Cloud-based energy scoring (α/β/γ/δ)
* [x] Policy gating & exit codes
* [x] Web dashboard sync
* [x] Evidence export (JSON & PDF)

**Coming Soon:**

* [ ] Batch processing (CSV input)
* [ ] `invarum check` regression suites
* [ ] Automated drift detection between runs

---

## 🧑‍🔬 Author

**Lucretius Coleman**
PhD in Physics | Computational Methods | Quantum Systems & Prompt Engineering
[lacolem1@invarum.com](mailto:lacolem1@invarum.com)

---

## ๐Ÿ“„ License

MIT. See `LICENSE`.
