Project-aware consolidated testing CLI for pytest, sandbox, empirical, and retained aggregate validation lanes.
Calamum Test
Calamum Test is a standalone, project-aware testing harness for consolidating pytest, sandbox_test, and empirical_test lanes behind one retained-evidence CLI and Python facade.
It is built for teams that want test execution, project context, and generated evidence to stay organized, reviewable, and reproducible.
Public repository: https://github.com/joediggidyyy/calamum
Install and verify
Install from PyPI using the published distribution name:
pip install calamum-test
Run the installed CLI using the runtime command name:
calamum --version
calamum -h
Important naming note:
- PyPI package / dependency name: calamum-test
- import package and runtime command: calamum
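Because the distribution name and the import name differ, metadata lookups need the PyPI name. The following is a minimal sketch using only the standard library; the helper name `installed_version` is illustrative and not part of Calamum.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "calamum-test") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Metadata is keyed by the PyPI distribution name ("calamum-test"),
    not by the import package name ("calamum").
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None
```

Calling `installed_version()` in an environment where `pip install calamum-test` has succeeded should return the version string; otherwise it returns `None`.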
Package highlights include:
- a shared .calamum/project.json descriptor model
- project-aware resolution for catalog, runs, and reports roots
- a stable Python facade in calamum.api
- tracked catalog metadata for profiles, tags, policy flags, and evidence requirements
- retained run manifests, checksums, and per-step output captures
- aggregate report generation for job, project, and domain scopes
- optional signing inputs for privileged or publication-oriented report flows
Command surface
Top-level families
- calamum test: validation definition discovery, execution, and retained evidence review
- calamum project: project registration, active-project state, and context readback
- calamum monitor: current monitor-shell scaffolding and capability readback
Test execution
calamum test list
calamum test show <definition_id>
calamum test run <definition_id>
calamum test runs list
calamum test runs show <run_id>
A definition_id is the exact id of a test definition in the catalog. Use calamum test list to discover available ids, then pass one of those ids to calamum test show or calamum test run.
The catalog currently includes definitions such as:
- seed-cli-smoke
- seed-adversarial-smoke
Long-running subprocess steps emit heartbeat lines to stderr while a run is active. Those heartbeats are transient status signals; the retained artifacts are written when the run completes.
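When a run is driven from a script, the heartbeat behaviour suggests keeping stderr attached to the terminal and capturing only stdout. The sketch below assumes that `--json` output is written to stdout, which this README does not state explicitly; `run_and_parse_stdout` is a hypothetical helper, not a Calamum function.

```python
import json
import subprocess
import sys


def run_and_parse_stdout(argv):
    """Run argv, parse its stdout as JSON, and leave stderr uncaptured.

    Assumption: the command prints a single JSON document on stdout.
    """
    completed = subprocess.run(
        argv,
        stdout=subprocess.PIPE,  # captured so it can be parsed afterwards
        stderr=None,             # inherited, so heartbeat lines appear live
        text=True,
        check=True,              # raise CalledProcessError on non-zero exit
    )
    return json.loads(completed.stdout)
```

A call such as `run_and_parse_stdout(["calamum", "test", "run", "seed-cli-smoke", "--json"])` would then show heartbeats while the run is active and return the parsed result once it completes, provided the stdout assumption holds.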
Project management
calamum project register
calamum project set <project>
calamum project current
calamum project validate [<project>]
calamum project list
calamum project show <project>
Compatibility note:
- calamum test project ... remains available as a compatibility alias for the top-level project family.
Monitor scaffolding
calamum monitor capability list
The monitor family currently provides capability inspection through calamum monitor capability list.
Aggregate reporting
calamum test reports list
calamum test reports show <report_ref>
calamum test reports generate --scope job --job <job_id>
calamum test reports generate --scope project [--project <project>]
calamum test reports generate --scope domain --domain <domain>
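The three generate forms differ only in the scope and its matching target flag. As a rough sketch, a script could assemble the argument list as below; `report_generate_argv` is an illustrative helper, and only the flags listed above are used.

```python
# Target flag that accompanies each --scope value, as listed above.
SCOPE_FLAGS = {"job": "--job", "project": "--project", "domain": "--domain"}


def report_generate_argv(scope, target=None):
    """Build the argv for `calamum test reports generate`.

    The target is optional only for the project scope, mirroring the
    `[--project <project>]` notation in the command list.
    """
    if scope not in SCOPE_FLAGS:
        raise ValueError(f"unknown scope: {scope!r}")
    if target is None and scope != "project":
        raise ValueError(f"scope {scope!r} requires a target")
    argv = ["calamum", "test", "reports", "generate", "--scope", scope]
    if target is not None:
        argv += [SCOPE_FLAGS[scope], target]
    return argv
```

The returned list can be passed to `subprocess.run` unchanged.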
Adding tests to the library
In Calamum, the public test library is the tracked catalog at catalog/test_definitions.json.
To add a definition:
- add a new definition object to the catalog definitions list
- give it a stable id, title, summary, status, and category
- classify it with:
  - profiles: reusable bundles such as smoke, release, or nightly
  - tags: cross-cutting labels for search and future selectors
  - policy_flags: execution or governance rules
  - evidence_requirements: retained outputs the definition must produce
- declare step arrays under the canonical lanes:
  - pytest
  - sandbox_test
  - empirical_test
- validate the new entry by running:
  - calamum test list
  - calamum test show <definition_id>
  - calamum test run <definition_id> --dry-run
Minimal definition shape:
{
  "id": "adversarial-auth-smoke",
  "title": "Adversarial auth smoke",
  "summary": "Challenge the authentication path with hostile-input and retained-evidence checks.",
  "status": "active",
  "category": "adversarial",
  "profiles": ["smoke", "release"],
  "tags": ["adversarial", "auth", "api", "signing"],
  "policy_flags": ["containment", "json-first", "project-aware", "release-gate"],
  "evidence_requirements": ["report_json", "report_md", "manifest_json", "checksums_json"],
  "default_lanes": ["pytest", "sandbox_test"],
  "lanes": {
    "pytest": [],
    "sandbox_test": [],
    "empirical_test": []
  }
}
Definitions are currently added directly in catalog/test_definitions.json, which remains the authoritative editing surface.
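Editing the catalog by hand is the documented path; for scripted edits, something like the following sketch could work. It assumes the catalog file is a JSON object whose `definitions` key holds the list of definition objects, which matches the wording above but is not spelled out in full here. `add_definition` is a hypothetical helper.

```python
import json
from pathlib import Path


def add_definition(catalog_path, definition):
    """Append a definition to the catalog file and return the new count.

    Assumption: the catalog is a JSON object with a "definitions" list.
    Raises ValueError if the id is already present, since ids must be stable
    and unique to be usable with `calamum test show` and `calamum test run`.
    """
    path = Path(catalog_path)
    catalog = json.loads(path.read_text(encoding="utf-8"))
    definitions = catalog.setdefault("definitions", [])
    if any(existing.get("id") == definition["id"] for existing in definitions):
        raise ValueError(f"duplicate definition id: {definition['id']}")
    definitions.append(definition)
    path.write_text(json.dumps(catalog, indent=2) + "\n", encoding="utf-8")
    return len(definitions)
```

After writing, `calamum test show <definition_id>` remains the check that Calamum itself accepts the entry.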
What the fields mean
- category = what kind of test this is
- profiles = when or why you run it
- tags = what area it touches
- policy_flags = special execution or governance rules
- evidence_requirements = which retained outputs must exist after the run
- pytest / sandbox_test / empirical_test = the three execution lane classes inside one definition
The important division is this:
- one definition = one named test in the library
- one definition can use one, two, or all three lane classes
- the lane classes are not separate library entries; they are the three ways Calamum gathers evidence for the same test
Controlled library vocabulary (v1)
Calamum treats the following values as the contracted v1 vocabulary.
Status values
- seed: scaffold or early placeholder definition
- active: supported definition for normal use
- experimental: usable but still being evaluated
- deprecated: still readable or runnable for transition purposes
- disabled: present in the catalog but not intended for ordinary execution
Category values
- adversarial: hostile-input, penetration-style, or abuse-case validation
- general: mixed or uncategorized definition
- bootstrap: setup, installation, or command-surface readiness
- regression: protects a known workflow or behavior from breaking
- security: validates defensive trust, signing, access, or safety posture
- performance: validates speed, scale, or resource posture
- integration: validates interaction across modules, services, or host applications
- compliance: validates policy, contract, or governance conformance
Profile values
- default: ordinary day-to-day execution set
- smoke: fast confidence check
- fast: low-cost local developer check
- release: required before publishing or promotion
- nightly: broader scheduled validation pack
Tag values
- adversarial
- aggregate
- api
- auth
- catalog
- cli
- filesystem
- project
- reporting
- retained-evidence
- sandbox
- signing
- smoke
Policy flag values
- json-first
- project-aware
- containment
- local-only
- signed-output
- privileged-operation
- release-gate
- deterministic-output
Evidence requirement values
- report_json
- report_md
- manifest_json
- checksums_json
- stdout_capture
- stderr_capture
- receipt_json
- report_signature
- manifest_signature
Lane classes
- pytest: automated code-level assertions
- sandbox_test: controlled scripted or simulated execution
- empirical_test: real observed or manual verification
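The v1 vocabulary above is small enough to check mechanically before committing a catalog edit. The sketch below is an illustrative pre-check written against the lists in this section; it is not Calamum's own validator, and whether Calamum rejects out-of-vocabulary values at runtime is not stated here.

```python
V1_VOCABULARY = {
    "status": {"seed", "active", "experimental", "deprecated", "disabled"},
    "category": {"adversarial", "general", "bootstrap", "regression",
                 "security", "performance", "integration", "compliance"},
    "profiles": {"default", "smoke", "fast", "release", "nightly"},
    "tags": {"adversarial", "aggregate", "api", "auth", "catalog", "cli",
             "filesystem", "project", "reporting", "retained-evidence",
             "sandbox", "signing", "smoke"},
    "policy_flags": {"json-first", "project-aware", "containment", "local-only",
                     "signed-output", "privileged-operation", "release-gate",
                     "deterministic-output"},
    "evidence_requirements": {"report_json", "report_md", "manifest_json",
                              "checksums_json", "stdout_capture",
                              "stderr_capture", "receipt_json",
                              "report_signature", "manifest_signature"},
}
LANES = {"pytest", "sandbox_test", "empirical_test"}


def vocabulary_problems(definition):
    """Return a list of "field: value" strings for out-of-vocabulary entries."""
    problems = []
    # Single-valued fields.
    for field in ("status", "category"):
        value = definition.get(field)
        if value not in V1_VOCABULARY[field]:
            problems.append(f"{field}: {value!r}")
    # List-valued fields.
    for field in ("profiles", "tags", "policy_flags", "evidence_requirements"):
        for value in definition.get(field, []):
            if value not in V1_VOCABULARY[field]:
                problems.append(f"{field}: {value!r}")
    # Lane names used either as keys of "lanes" or in "default_lanes".
    used = set(definition.get("lanes", {})) | set(definition.get("default_lanes", []))
    for lane in sorted(used - LANES):
        problems.append(f"lanes: {lane!r}")
    return problems
```

An empty list means every classified value is drawn from the contracted vocabulary; putting `adversarial` in a lane field, for example, would be reported as a lane problem.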
How adversarial testing is represented
- use category: adversarial when hostile challenge is the primary identity of the definition
- use tag adversarial when a definition is primarily something else but still includes adversarial coverage
- keep adversarial out of the lane field; it is a test type, not an execution medium
- execute adversarial definitions through one or more of the normal lanes
How the three test classes work together
The simplest mental model is: use one definition when you are testing one real thing.
Example definition:
- id: adversarial-auth-smoke
- category: adversarial
- profiles: smoke, release
- tags: adversarial, auth, api, signing
Then split the same test across the three lane classes like this:
- pytest lane
  - proves the code-level rules work
  - example: token validation, permission checks, malformed-input rejection
- sandbox_test lane
  - proves the workflow runs correctly in a controlled runtime
  - example: run the CLI against hostile fixtures in a safe local sandbox
- empirical_test lane
  - proves the result still holds in a real observed workflow
  - example: operator review of a real auth flow or delegated request path
In short:
- pytest asks: does the code logic pass?
- sandbox_test asks: does the workflow run correctly in a safe controlled environment?
- empirical_test asks: does it still look correct when a human or real-world run observes it?
Typical full workflow:
- add the definition to catalog/test_definitions.json
- run calamum test show adversarial-auth-smoke
- run calamum test run adversarial-auth-smoke --dry-run
- run calamum test run adversarial-auth-smoke
- inspect one combined retained evidence pack under .calamum/generated/runs/<run_id>/
- if the run belongs to a job or release lane, generate aggregates under .calamum/generated/reports/generated/
That is the core model: one named test definition, up to three lane classes, and one retained evidence pack.
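The command portion of that workflow can be scripted. The following is a small sketch that only assembles and runs the three documented commands in order; `workflow_commands` and `run_workflow` are illustrative names, and the `runner` parameter exists so the sequence can be exercised without invoking the real CLI.

```python
import subprocess


def workflow_commands(definition_id):
    """Return the show, dry-run, and run commands for one definition, in order."""
    base = ["calamum", "test"]
    return [
        base + ["show", definition_id],
        base + ["run", definition_id, "--dry-run"],
        base + ["run", definition_id],
    ]


def run_workflow(definition_id, runner=subprocess.run):
    """Run each command in order; check=True stops at the first failure."""
    for argv in workflow_commands(definition_id):
        runner(argv, check=True)
```

Stopping at the first failure keeps a broken dry run from being followed by a real run.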
Retained evidence contract
Every calamum test run retains:
- report.json
- report.md
- checksums.json
- manifest.json
- per-step stdout and stderr captures
- append-only .calamum/generated/runs/run_index.jsonl
Aggregate report generation retains:
- report.json
- report.md
- manifest.json
- receipt.json
- checksum sidecars
- optional detached signatures for JSON artifacts
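A reviewer or CI step can confirm that a run directory honours the per-run part of this contract. The sketch below checks only for the four named files; it does not inspect their contents, and `missing_run_artifacts` is a hypothetical helper.

```python
from pathlib import Path

# Files that every `calamum test run` is documented to retain.
RUN_ARTIFACTS = ("report.json", "report.md", "checksums.json", "manifest.json")


def missing_run_artifacts(run_dir):
    """Return the names of contract files that are absent from run_dir."""
    root = Path(run_dir)
    return [name for name in RUN_ARTIFACTS if not (root / name).is_file()]
```

With the default layout, `run_dir` would be a path of the form `.calamum/generated/runs/<run_id>/`.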
Filesystem layout and default output contract
Calamum uses a small split between tracked inputs and local-only generated outputs.
Tracked by default:
- .calamum/project.json: shared project descriptor
- catalog/test_definitions.json: tracked definition catalog
Local-only by default:
- .calamum/generated/runs/: retained run evidence
- .calamum/generated/reports/: materialized aggregate reports
- .calamum/generated/.gitignore: local-only guard so generated output stays untracked
Default tree:
project-root/
├─ .calamum/
│  ├─ project.json
│  └─ generated/
│     ├─ .gitignore
│     ├─ runs/
│     │  ├─ run_index.jsonl
│     │  └─ <run_id>/
│     │     ├─ report.json
│     │     ├─ report.md
│     │     ├─ checksums.json
│     │     ├─ manifest.json
│     │     └─ <lane>/
│     │        ├─ <step>.stdout.txt
│     │        └─ <step>.stderr.txt
│     └─ reports/
│        └─ generated/
│           ├─ report_index.jsonl
│           └─ <scope>/
│              └─ <target>/
│                 ├─ report.json
│                 ├─ report.md
│                 ├─ manifest.json
│                 ├─ receipt.json
│                 ├─ *.sha256
│                 ├─ *.sig
│                 └─ history/<timestamp>/
└─ catalog/
   └─ test_definitions.json
Notes
- the checked-in .calamum/project.json points outputs into .calamum/generated/
- calamum project register can bootstrap the minimal local scaffold for an adopted repo
- explicit command flags override descriptor defaults when you intentionally want a different layout
If you do not override anything, retained test runs go to .calamum/generated/runs/ and aggregate reports go to .calamum/generated/reports/generated/.
Project resolution order
Calamum resolves project context in the following order:
- explicit --project
- nearest ancestor .calamum/project.json
- CALAMUM_PROJECT
- active project stored in local state
Within a resolved project, path resolution follows:
- explicit command flags
- machine-local overlay
- shared descriptor
- built-in defaults
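Both orderings are first-match-wins precedence chains. The sketch below illustrates that idea only; it is not Calamum's resolver. It assumes the ancestor search starts at the starting directory itself, the `(source, value)` return shape is invented for illustration, and setting names such as `runs_root` are hypothetical.

```python
import os
from pathlib import Path


def find_descriptor(start):
    """Walk from start up through its parents to the nearest .calamum/project.json."""
    current = Path(start).resolve()
    for directory in (current, *current.parents):
        candidate = directory / ".calamum" / "project.json"
        if candidate.is_file():
            return candidate
    return None


def resolve_project(explicit=None, start=".", environ=None, active=None):
    """Return (source, value) for the first project source that is set, else None."""
    environ = os.environ if environ is None else environ
    if explicit:                               # 1. explicit --project
        return ("--project", explicit)
    descriptor = find_descriptor(start)        # 2. nearest ancestor descriptor
    if descriptor is not None:
        return ("descriptor", str(descriptor))
    if environ.get("CALAMUM_PROJECT"):         # 3. environment variable
        return ("CALAMUM_PROJECT", environ["CALAMUM_PROJECT"])
    if active:                                 # 4. active project in local state
        return ("active", active)
    return None


def first_configured(key, flags, overlay, descriptor, defaults):
    """Path resolution: flags, then overlay, then descriptor, then defaults."""
    for layer in (flags, overlay, descriptor, defaults):
        if layer.get(key) is not None:
            return layer[key]
    return None
```

Note that under this ordering a descriptor found in an ancestor directory takes precedence over `CALAMUM_PROJECT`, so the environment variable only matters outside a project tree or when no descriptor exists.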
Quick start
From projects/calamum/:
- install in editable mode
- validate the default shared descriptor
- list the current seed definitions
- run a seed smoke definition
- generate a project aggregate from retained evidence
Example flow:
python -m pip install -e .[dev]
calamum --version
calamum project current --json
calamum test list --json
calamum test run seed-cli-smoke --job local-smoke --json
calamum test reports generate --scope project --project calamum-test --json
After the sample run, inspect:
- .calamum/generated/runs/run_index.jsonl
- .calamum/generated/reports/generated/report_index.jsonl
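Both index files carry the `.jsonl` extension, so they can presumably be read as JSON Lines, one JSON value per line. The sketch below relies on that assumption and makes no claim about which fields each record contains; `read_jsonl` is an illustrative helper.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Parse a JSON Lines file into a list, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

Because the run index is append-only, the last element of `read_jsonl(".calamum/generated/runs/run_index.jsonl")` would correspond to the most recently appended entry.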
Signing and privileged flows
Aggregate generation supports optional signing inputs for privileged or publication-oriented flows.
Relevant environment variables include:
- CALAMUM_ED25519_PRIVATE_KEY
- CALAMUM_ED25519_PUBLIC_KEY
- CALAMUM_POLICY_SIGNING_KEY
- CALAMUM_CONFIG_ROOT
The repository also includes .env.example with placeholder values for local signing and config setup. Keep real keys and machine-local overrides in ignored local files only.
For local development, unsigned or fallback signature workflows can still be useful. For publishable or cross-application flows, prefer Ed25519-backed signing.
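Before a publishable run it can help to confirm which of these variables are present without echoing their values. The sketch below reports presence only; how Calamum interprets each variable is not described here, and `env_status` is a hypothetical helper.

```python
import os

# Variable names as listed above.
CALAMUM_ENV_VARS = (
    "CALAMUM_ED25519_PRIVATE_KEY",
    "CALAMUM_ED25519_PUBLIC_KEY",
    "CALAMUM_POLICY_SIGNING_KEY",
    "CALAMUM_CONFIG_ROOT",
)


def env_status(environ=None):
    """Map each variable name to True if it is set and non-empty.

    Only booleans are returned, so key material never reaches logs.
    """
    environ = os.environ if environ is None else environ
    return {name: bool(environ.get(name)) for name in CALAMUM_ENV_VARS}
```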
Python facade
The package exports a stable Python surface through calamum.api, including helpers to:
- resolve or require project context
- register and validate projects
- list, show, and run definitions
- list and inspect retained runs
- generate, list, and inspect aggregate reports
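The individual helper names are not listed in this README, so rather than guess at them, the sketch below discovers whatever the installed module exports. It uses only `importlib` from the standard library; `public_names` is an illustrative helper and returns `None` when the module cannot be imported.

```python
import importlib


def public_names(module_name="calamum.api"):
    """Return the sorted public names of a module, or None if it is not importable.

    Prefers the module's __all__ when defined; otherwise falls back to
    every attribute that does not start with an underscore.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    exported = getattr(module, "__all__", None)
    if exported is not None:
        return sorted(exported)
    return sorted(name for name in dir(module) if not name.startswith("_"))
```

Running `public_names()` after installation, followed by `help()` on a name of interest, is one way to read the actual signatures before calling them.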
Why this repo exists
Calamum packages a reusable testing substrate that gives projects one consistent place to define tests, execute them across multiple validation lanes, and retain evidence worth reviewing later.
Design goals:
- deterministic project-aware execution
- retained evidence instead of terminal-history dependence
- JSON-first machine readability with Markdown companions
- regenerable report surfaces
- credible signing hooks without hardcoding secrets
Security
For vulnerability reporting and security guidance, see SECURITY.md.
Contributing
Development and contribution guidance lives in CONTRIBUTING.md.
File details
Details for the file calamum_test-0.3.1.tar.gz.
File metadata
- Download URL: calamum_test-0.3.1.tar.gz
- Upload date:
- Size: 3.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 029d62d218f59bc03a48d1e50df210d96cfe0d034375e321097b7a7f4b6e76d4 |
| MD5 | 63e1e0bb80025d0b6ac669cf6c129b35 |
| BLAKE2b-256 | fdbdaf5b5a3d54939eb228fe1b96b53e6df3fbe7329825d79a746b38c08ecb6b |
File details
Details for the file calamum_test-0.3.1-py3-none-any.whl.
File metadata
- Download URL: calamum_test-0.3.1-py3-none-any.whl
- Upload date:
- Size: 45.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 43ff1222dff3f0743048e639b736ffe1add9c32da490d13ad9486fe46b6bae98 |
| MD5 | dc2556fdcc4e9e7816f2ce59f151cd91 |
| BLAKE2b-256 | 7a69ef6e1bfa0f839660de70d80c26c34b6844bd2030b126c60f5d00b49f1669 |
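A downloaded file can be checked against the published SHA256 digest with the standard library. This is a generic sketch using `hashlib`; the helper names are illustrative.

```python
import hashlib


def sha256_hex(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_hex(path) == expected_hex.strip().lower()
```

For the wheel, the expected value is the SHA256 digest listed for calamum_test-0.3.1-py3-none-any.whl; for the source distribution, use the digest listed for calamum_test-0.3.1.tar.gz.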