CLI for Invarum, the Governance-grade LLM Quality Engineering platform.

⚡ Invarum CLI

Prompt it. Measure it. Fix it. Prove it. Bring certainty, and evidence, to LLM quality.

The Invarum CLI is a thin, fast client for the Invarum Cloud Engine. Use it to run quantitative LLM evaluations, generate audit-ready evidence bundles, and enforce policy gates in CI/CD, all without leaving the command line.

Get started: You need an Invarum account and API key. Sign up at app.invarum.com.


📦 Features

1) Headless Invarum Engine

Submit prompts to the Invarum Cloud, where they're evaluated with the deterministic 4D Energy Model.

  • Live status: stream progress and view the final response in your terminal
  • Scoring: get immediate α / β / γ / δ scores in a readable table

2) Audit-Ready Evidence

Export forensic artifacts for any run, ready to attach to an incident review or internal audit packet.

  • JSON evidence bundle: machine-readable export containing scores, policy outcomes, metadata, and SHA-256 integrity hashes
  • PDF report: download a formatted audit report via the CLI

3) CI/CD Gating

Stop bad prompts from reaching production.

  • Use --strict to return exit code 1 when a run fails policy gates
  • Ideal for GitHub Actions, GitLab CI, and regression test suites

4) Enterprise Observability (OTel)

Invarum is OpenTelemetry (OTel) native.

  • Each run can emit standard OTel traces
  • Connect Datadog, Honeycomb, or New Relic to view quality signals alongside operational telemetry

⚛️ The Invarum Engine

Unlike “LLM-as-a-judge” tools that depend on subjective model opinions, Invarum evaluates outputs using a deterministic pipeline and returns repeatable scores, policy gate decisions, and audit-ready evidence bundles suitable for incident review and internal governance.

The 4D Energy Model

We measure LLM behavior along four orthogonal axes:

| Metric | Signal | What it Measures |
|---|---|---|
| α TaskScore | Task alignment | Did the output follow the request and constraints (format, requirements, and reference match when provided)? |
| β Coherence | Semantic continuity | Did the response stay on track: logically consistent, well-structured, and free of drift or contradiction? |
| γ Entropy / Order | Variance & determinism | Is output variability appropriate for the domain and task (stable for scientific/legal; broader for creative/brainstorming)? |
| δ Efficiency | Cost-to-value | How much useful information was delivered per token (and time), relative to the expected structure and verbosity? |

The physics analogy is intentional: scores behave like measurable state variables, and policy gates define what “stable” looks like for a given domain.
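To make the state-variable framing concrete, here is a minimal Python sketch that holds four scores and reports which axes fall under a set of minimums. The `EnergyScores` class, the `below_threshold` helper, and the threshold values are all hypothetical, and treating γ as "higher is better" is a simplification; real thresholds are defined by Invarum policy profiles and are not documented here.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EnergyScores:
    """Four scores in [0, 1]; field names are illustrative, not an Invarum schema."""
    alpha: float  # task alignment
    beta: float   # coherence
    gamma: float  # entropy / order
    delta: float  # efficiency


def below_threshold(scores: EnergyScores, minimums: dict[str, float]) -> list[str]:
    """Return the names of axes whose score is under the given minimum."""
    values = asdict(scores)
    return [axis for axis, floor in minimums.items() if values[axis] < floor]


# Hypothetical minimums for a strict domain; real policy profiles may differ.
HYPOTHETICAL_MINIMUMS = {"alpha": 0.80, "beta": 0.80, "gamma": 0.50, "delta": 0.60}

scores = EnergyScores(alpha=0.892, beta=0.910, gamma=0.450, delta=0.780)
print(below_threshold(scores, HYPOTHETICAL_MINIMUMS))  # → ['gamma']
```

Keeping the thresholds in a plain dictionary mirrors the idea that the same scores can be judged differently per domain: swapping the dictionary changes the verdict without touching the scores.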

Policy-as-Code Gating

Runs are evaluated against a selected Policy Profile (internal governance by default). The engine returns:

  • Gate results (must-pass requirements and scored thresholds)
  • An overall verdict plus an explicit decision state: pass / pass_with_advisory / fail_with_advisory / fail
  • Structured advisories with recommended remediation steps
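The four decision states can be modeled as a small enum when consuming results in your own tooling. This is a sketch only: the `Decision` enum and both helpers are hypothetical names, and the rule that both fail states block a release is an assumption, not documented Invarum behavior.

```python
from enum import Enum


class Decision(str, Enum):
    """The four decision states listed above."""
    PASS = "pass"
    PASS_WITH_ADVISORY = "pass_with_advisory"
    FAIL_WITH_ADVISORY = "fail_with_advisory"
    FAIL = "fail"


def is_blocking(decision: Decision) -> bool:
    """Assumption: both fail states block a release; both pass states do not."""
    return decision in (Decision.FAIL, Decision.FAIL_WITH_ADVISORY)


def has_advisory(decision: Decision) -> bool:
    """True when the state carries advisories worth reading."""
    return decision.value.endswith("_with_advisory")


decision = Decision("pass_with_advisory")
print(is_blocking(decision), has_advisory(decision))  # → False True
```

If the state arrives in upper case, as in the CLI's `Decision: PASS_WITH_ADVISORY` line, lower-casing it first (`Decision(text.lower())`) maps it onto the same enum.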

Security & Privacy

Invarum is designed for auditability without unnecessary data retention:

  1. BYOK: your LLM API keys are encrypted at rest and never exposed in plaintext.
  2. Configurable I/O retention: prompts and responses can be stored temporarily for debugging or minimized/redacted depending on workspace policy.
  3. Immutable evidence: evidence bundles retain SHA-256 hashes and run metadata for integrity verification, even when raw text retention is minimized.
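The third point rests on a general property of cryptographic hashes: a stored digest lets you confirm later that a given text is the one that was evaluated, without keeping the text itself. The sketch below illustrates that idea with Python's standard `hashlib`; the `prompt_sha256` field name and both helper functions are hypothetical and do not describe the actual evidence bundle schema.

```python
import hashlib
import hmac


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def matches_record(candidate: str, recorded_digest: str) -> bool:
    """Check whether candidate text is the text that produced the stored digest."""
    return hmac.compare_digest(sha256_hex(candidate), recorded_digest)


prompt = "Summarize the main findings of this abstract in 5 bullets."
# Hypothetical minimized record: only the digest is kept, not the raw text.
record = {"prompt_sha256": sha256_hex(prompt)}

print(matches_record(prompt, record["prompt_sha256"]))        # → True
print(matches_record(prompt + " ", record["prompt_sha256"]))  # → False
```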

🚀 Installation

Install directly via pip:

pip install git+https://github.com/Invarum/invarum-cli.git@v0.1.6

Requires Python 3.9+


⚡ Quickstart

1) Get an API Key

Log in to the dashboard: Settings → Developer Access Keys.

2) Authenticate

Save your key locally. This persists until you revoke it.

invarum login --key inv_sk_your_secret_key_here

3) Run an Evaluation

invarum run "Summarize the main findings of this abstract in 5 bullets." --domain scientific

Example Output:

Running evaluation...
Run ID: run_a1b2c3d4

╭─ LLM Response ──────────────────────────────────────╮
│ 1. The study establishes a correlation between...   │
│ 2. Methodology involved a double-blind trial...     │
│ ...                                                 │
╰─────────────────────────────────────────────────────╯

┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Metric               ┃ Score ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ Alpha (Task)         │ 0.892 │
│ Beta (Coherence)     │ 0.910 │
│ Gamma (Order/Entropy)│ 0.450 │
│ Delta (Efficiency)   │ 0.780 │
└──────────────────────┴───────┘

Decision: PASS_WITH_ADVISORY
Policy Profile: internal_governance_default
View details: https://app.invarum.com/runs/run_a1b2c3d4

Tip: Open “View details” to inspect diagnostics, sensitivity analysis, and operator traces in the dashboard.


🛠 Advanced Usage

Reference-Based Grading

Provide a gold-standard answer to enable higher-fidelity grading when appropriate.

invarum run "Explain quantum entanglement" --reference "Quantum entanglement is a phenomenon where..."

Load from files:

invarum run -f prompt.txt --reference-file ground_truth.txt

Task, Domain, and Generation Overrides

Help classification or tune generation.

# Specify task and domain
invarum run "extract dates from this contract" --task extract --domain legal

# Override model temperature
invarum run "Write a creative poem" --temp 0.9

Export Evidence (Incident Review / Audit Packet)

# Export JSON evidence bundle
invarum export run_a1b2c3d4 --format json --output evidence.json

# Export formatted PDF audit report
invarum export run_a1b2c3d4 --format pdf --output report.pdf
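When an exported file is attached to an audit packet, it can help to record a digest of the file itself so later copies can be compared against it. The following sketch uses only Python's standard library and makes no assumption about the contents of the bundle; `file_sha256` is a hypothetical helper, and the file names are the ones used in the export commands above.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 65536) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large PDF reports do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Print "<digest>  <name>" for each exported artifact that exists.
    for name in ("evidence.json", "report.pdf"):
        artifact = Path(name)
        if artifact.exists():
            print(f"{file_sha256(artifact)}  {name}")
```

The output format matches what `sha256sum` prints, so the lines can be saved to a checksum file and verified later with standard tools.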

CI/CD Integration

The CLI supports environment variables for automation.

export INVARUM_API_KEY="inv_sk_..."

# --strict forces a non-zero exit code on policy failure
invarum run -f prompt.txt --strict --json > results.json
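In pipelines driven by a Python script rather than a shell step, the same gate can be wrapped with `subprocess`. This is a minimal sketch: `run_gate` is a hypothetical helper that relies only on the documented behavior that `--strict` yields a non-zero exit code on policy failure. Stand-in commands are used so the example runs without the CLI installed.

```python
import subprocess
import sys
from typing import Sequence


def run_gate(command: Sequence[str]) -> bool:
    """Run a command and report whether it exited with code 0."""
    completed = subprocess.run(list(command), capture_output=True, text=True)
    if completed.returncode != 0:
        # Surface whatever the command printed so the CI log explains the failure.
        sys.stderr.write(completed.stderr)
    return completed.returncode == 0


# Stand-in commands that simply exit with a chosen code.
# In CI, pass ["invarum", "run", "-f", "prompt.txt", "--strict", "--json"] instead.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]
print(run_gate(passing), run_gate(failing))  # → True False
```

A calling script would typically end with `sys.exit(0 if run_gate(cmd) else 1)` so the CI job fails when the gate does.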

🧠 Architecture

Invarum uses a thin client architecture:

  1. CLI (this repo): auth, file IO, request formatting, and rendering. No proprietary scoring logic runs locally.
  2. Cloud engine: prompts are evaluated by the PBPEF pipeline, producing scores, policy outcomes, traces, and evidence artifacts.

[CLI] → [API Gateway] → [PBPEF Pipeline] → [Run Record + Evidence]
  ↑                                                  ↓
  └────────────── summarized results ────────────────┘

❓ Troubleshooting

"Command not found" after installation?

If you ran pip install but typing invarum gives an error, the directory where pip places Python scripts is probably not on your system PATH.

Either add that directory to your PATH, or run the tool through python -m:

python -m invarum login
python -m invarum run "Test prompt"
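To find out which directory is missing from PATH, you can ask Python where it installs console scripts. The sketch below uses the standard `sysconfig` module; `scripts_dir` and `is_on_path` are hypothetical helper names. Note that it reports the default scripts directory for the running interpreter, which can differ from the location used by `pip install --user`.

```python
import os
import sysconfig
from pathlib import Path


def scripts_dir() -> Path:
    """Directory where pip places console scripts for this interpreter."""
    return Path(sysconfig.get_path("scripts"))


def is_on_path(directory: Path, path_value: str) -> bool:
    """True if directory appears as an entry in a PATH-style string."""
    target = os.path.normcase(os.path.normpath(str(directory)))
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(os.path.normcase(os.path.normpath(entry)) == target for entry in entries)


directory = scripts_dir()
# Prints the scripts directory and whether the current PATH includes it.
print(directory, is_on_path(directory, os.environ.get("PATH", "")))
```

If the second value printed is False, adding the printed directory to PATH should make the invarum command available in new terminal sessions.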

🔬 Roadmap

MVP (Live Now):

  • Cloud-based energy scoring (α/β/γ/δ)
  • Policy gating & exit codes
  • Web dashboard sync
  • Evidence export (JSON & PDF)

Coming Soon:

  • Batch processing (CSV input)
  • invarum check regression suites
  • Automated drift detection between runs

🧑‍🔬 Author

Lucretius Coleman
PhD in Physics | Computational Methods | Quantum Systems & Prompt Engineering
lacolem1@invarum.com


📄 License

MIT. See LICENSE.
