Project-aware consolidated testing CLI for pytest, sandbox, empirical, and retained aggregate validation lanes.

These details have not been verified by PyPI

Project links

Project description

Calamum Test

Calamum logo

Calamum Test is a standalone, project-aware testing harness for consolidating pytest, sandbox_test, and empirical_test lanes behind one retained-evidence CLI and Python facade.

It is built for teams that want test execution, project context, and generated evidence to stay organized, reviewable, and reproducible.

Public repository: https://github.com/joediggidyyy/calamum

Install and verify

Install from PyPI using the published distribution name:

pip install calamum-test

Run the installed CLI using the runtime command name:

calamum --version
calamum -h

Important naming note:

PyPI package / dependency name: calamum-test
import package and runtime command: calamum

Package highlights include:

a shared .calamum/project.json descriptor model
project-aware resolution for catalog, runs, and reports roots
a stable Python facade in calamum.api
tracked catalog metadata for profiles, tags, policy flags, and evidence requirements
retained run manifests, checksums, and per-step output captures
aggregate report generation for job, project, and domain scopes
optional signing inputs for privileged or publication-oriented report flows

Command surface

Top-level families

calamum test — validation definition discovery, execution, and retained evidence review
calamum project — project registration, active-project state, and context readback
calamum monitor — current monitor-shell scaffolding and capability readback

Test execution

calamum test list
calamum test show <definition_id>
calamum test run <definition_id>
calamum test runs list
calamum test runs show <run_id>

A definition_id is the exact id of a test definition in the catalog. Use calamum test list to discover available ids, then pass one of those ids to calamum test show or calamum test run.

The catalog currently includes definitions such as:

seed-cli-smoke
seed-adversarial-smoke

Long-running subprocess steps emit heartbeat lines to stderr while a run is active. Those heartbeats are transient status signals; the retained artifacts are written when the run completes.

Project management

calamum project register
calamum project set <project>
calamum project current
calamum project validate [<project>]
calamum project list
calamum project show <project>

Compatibility note:

calamum test project ... remains available as a compatibility alias for the top-level project family.

Monitor scaffolding

calamum monitor capability list

The monitor family currently provides capability inspection through calamum monitor capability list.

Aggregate reporting

calamum test reports list
calamum test reports show <report_ref>
calamum test reports generate --scope job --job <job_id>
calamum test reports generate --scope project [--project <project>]
calamum test reports generate --scope domain --domain <domain>

Adding tests to the library

In Calamum, the public test library is the tracked catalog at catalog/test_definitions.json.

To add a definition:

add a new definition object to the catalog definitions list
give it a stable id, title, summary, status, and category
classify it with:
- profiles — reusable bundles such as smoke, release, or nightly
- tags — cross-cutting labels for search and future selectors
- policy_flags — execution or governance rules
- evidence_requirements — retained outputs the definition must produce
declare step arrays under the canonical lanes:
- pytest
- sandbox_test
- empirical_test
validate the new entry by running:
- calamum test list
- calamum test show <definition_id>
- calamum test run <definition_id> --dry-run

Minimal definition shape:

{
	"id": "adversarial-auth-smoke",
	"title": "Adversarial auth smoke",
	"summary": "Challenge the authentication path with hostile-input and retained-evidence checks.",
	"status": "active",
	"category": "adversarial",
	"profiles": ["smoke", "release"],
	"tags": ["adversarial", "auth", "api", "signing"],
	"policy_flags": ["containment", "json-first", "project-aware", "release-gate"],
	"evidence_requirements": ["report_json", "report_md", "manifest_json", "checksums_json"],
	"default_lanes": ["pytest", "sandbox_test"],
	"lanes": {
		"pytest": [],
		"sandbox_test": [],
		"empirical_test": []
	}
}

Definitions are currently added directly in catalog/test_definitions.json, which remains the authoritative editing surface.

What the fields mean

category = what kind of test this is
profiles = when or why you run it
tags = what area it touches
policy_flags = special execution or governance rules
evidence_requirements = which retained outputs must exist after the run
pytest / sandbox_test / empirical_test = the three execution lane classes inside one definition

The important division is this:

one definition = one named test in the library
one definition can use one, two, or all three lane classes
the lane classes are not separate library entries; they are the three ways Calamum gathers evidence for the same test

Controlled library vocabulary (v1)

Calamum treats the following values as the contracted v1 vocabulary.

Status values

seed — scaffold or early placeholder definition
active — supported definition for normal use
experimental — usable but still being evaluated
deprecated — still readable or runnable for transition purposes
disabled — present in the catalog but not intended for ordinary execution

Category values

adversarial — hostile-input, penetration-style, or abuse-case validation
general — mixed or uncategorized definition
bootstrap — setup, installation, or command-surface readiness
regression — protects a known workflow or behavior from breaking
security — validates defensive trust, signing, access, or safety posture
performance — validates speed, scale, or resource posture
integration — validates interaction across modules, services, or host applications
compliance — validates policy, contract, or governance conformance

Profile values

default — ordinary day-to-day execution set
smoke — fast confidence check
fast — low-cost local developer check
release — required before publishing or promotion
nightly — broader scheduled validation pack

Policy flag values

json-first
project-aware
containment
local-only
signed-output
privileged-operation
release-gate
deterministic-output

Evidence requirement values

report_json
report_md
manifest_json
checksums_json
stdout_capture
stderr_capture
receipt_json
report_signature
manifest_signature

Lane classes

pytest — automated code-level assertions
sandbox_test — controlled scripted or simulated execution
empirical_test — real observed or manual verification

How adversarial testing is represented

use category: adversarial when hostile challenge is the primary identity of the definition
use tag adversarial when a definition is primarily something else but still includes adversarial coverage
keep adversarial out of the lane field; it is a test type, not an execution medium
execute adversarial definitions through one or more of the normal lanes

How the three test classes work together

The simplest mental model is: use one definition when you are testing one real thing.

Example definition:

id: adversarial-auth-smoke
category: adversarial
profiles: smoke, release
tags: adversarial, auth, api, signing

Then split the same test across the three lane classes like this:

pytest lane
- proves the code-level rules work
- example: token validation, permission checks, malformed-input rejection
sandbox_test lane
- proves the workflow runs correctly in a controlled runtime
- example: run the CLI against hostile fixtures in a safe local sandbox
empirical_test lane
- proves the result still holds in a real observed workflow
- example: operator review of a real auth flow or delegated request path

In short:

pytest asks: does the code logic pass?
sandbox_test asks: does the workflow run correctly in a safe controlled environment?
empirical_test asks: does it still look correct when a human or real-world run observes it?

Typical full workflow:

add the definition to catalog/test_definitions.json
run calamum test show adversarial-auth-smoke
run calamum test run adversarial-auth-smoke --dry-run
run calamum test run adversarial-auth-smoke
inspect one combined retained evidence pack under .calamum/generated/runs/<run_id>/
if the run belongs to a job or release lane, generate aggregates under .calamum/generated/reports/generated/

That is the core model: one named test definition, up to three lane classes, and one retained evidence pack.

Retained evidence contract

Every calamum test run retains:

report.json
report.md
checksums.json
manifest.json
per-step stdout and stderr captures
append-only .calamum/generated/runs/run_index.jsonl

Aggregate report generation retains:

report.json
report.md
manifest.json
receipt.json
checksum sidecars
optional detached signatures for JSON artifacts

Filesystem layout and default output contract

Calamum uses a small split between tracked inputs and local-only generated outputs.

Tracked by default:

.calamum/project.json — shared project descriptor
catalog/test_definitions.json — tracked definition catalog

Local-only by default:

.calamum/generated/runs/ — retained run evidence
.calamum/generated/reports/ — materialized aggregate reports
.calamum/generated/.gitignore — local-only guard so generated output stays untracked

Default tree:

project-root/
├─ .calamum/
│  ├─ project.json
│  └─ generated/
│     ├─ .gitignore
│     ├─ runs/
│     │  ├─ run_index.jsonl
│     │  └─ <run_id>/
│     │     ├─ report.json
│     │     ├─ report.md
│     │     ├─ checksums.json
│     │     ├─ manifest.json
│     │     └─ <lane>/
│     │        ├─ <step>.stdout.txt
│     │        └─ <step>.stderr.txt
│     └─ reports/
│        └─ generated/
│           ├─ report_index.jsonl
│           └─ <scope>/
│              └─ <target>/
│                 ├─ report.json
│                 ├─ report.md
│                 ├─ manifest.json
│                 ├─ receipt.json
│                 ├─ *.sha256
│                 ├─ *.sig
│                 └─ history/<timestamp>/
└─ catalog/
	└─ test_definitions.json

Notes

the checked-in .calamum/project.json points outputs into .calamum/generated/
calamum project register can bootstrap the minimal local scaffold for an adopted repo
explicit command flags override descriptor defaults when you intentionally want a different layout

If you do not override anything, retained test runs go to .calamum/generated/runs/ and aggregate reports go to .calamum/generated/reports/generated/.

Project resolution order

Calamum resolves project context in the following order:

explicit --project
nearest ancestor .calamum/project.json
CALAMUM_PROJECT
active project stored in local state

Within a resolved project, path resolution follows:

explicit command flags
machine-local overlay
shared descriptor
built-in defaults

Quick start

From projects/calamum/:

install in editable mode
validate the default shared descriptor
list the current seed definitions
run a seed smoke definition
generate a project aggregate from retained evidence

Example flow:

python -m pip install -e .[dev]
calamum --version
calamum project current --json
calamum test list --json
calamum test run seed-cli-smoke --job local-smoke --json
calamum test reports generate --scope project --project calamum-test --json

After the sample run, inspect:

.calamum/generated/runs/run_index.jsonl
.calamum/generated/reports/generated/report_index.jsonl

Signing and privileged flows

Aggregate generation supports optional signing inputs for privileged or publication-oriented flows.

Relevant environment variables include:

CALAMUM_ED25519_PRIVATE_KEY
CALAMUM_ED25519_PUBLIC_KEY
CALAMUM_POLICY_SIGNING_KEY
CALAMUM_CONFIG_ROOT

The repository also includes .env.example with placeholder values for local signing and config setup. Keep real keys and machine-local overrides in ignored local files only.

For local development, unsigned or fallback signature workflows can still be useful. For publishable or cross-application flows, prefer Ed25519-backed signing.

Python facade

The package exports a stable Python surface through calamum.api, including helpers to:

resolve or require project context
register and validate projects
list, show, and run definitions
list and inspect retained runs
generate, list, and inspect aggregate reports

Why this repo exists

Calamum packages a reusable testing substrate that gives projects one consistent place to define tests, execute them across multiple validation lanes, and retain evidence worth reviewing later.

Design goals:

deterministic project-aware execution
retained evidence instead of terminal-history dependence
JSON-first machine readability with Markdown companions
regenerable report surfaces
credible signing hooks without hardcoding secrets

Security

For vulnerability reporting and security guidance, see SECURITY.md.

Contributing

Development and contribution guidance lives in CONTRIBUTING.md.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.3.1

May 3, 2026

0.3.0

May 2, 2026

0.2.0

Apr 24, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

calamum_test-0.3.1.tar.gz (3.3 MB view details)

Uploaded May 3, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

calamum_test-0.3.1-py3-none-any.whl (45.7 kB view details)

Uploaded May 3, 2026 Python 3

File details

Details for the file calamum_test-0.3.1.tar.gz.

File metadata

Download URL: calamum_test-0.3.1.tar.gz
Upload date: May 3, 2026
Size: 3.3 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for calamum_test-0.3.1.tar.gz
Algorithm	Hash digest
SHA256	`029d62d218f59bc03a48d1e50df210d96cfe0d034375e321097b7a7f4b6e76d4`
MD5	`63e1e0bb80025d0b6ac669cf6c129b35`
BLAKE2b-256	`fdbdaf5b5a3d54939eb228fe1b96b53e6df3fbe7329825d79a746b38c08ecb6b`

See more details on using hashes here.

File details

Details for the file calamum_test-0.3.1-py3-none-any.whl.

File metadata

Download URL: calamum_test-0.3.1-py3-none-any.whl
Upload date: May 3, 2026
Size: 45.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for calamum_test-0.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`43ff1222dff3f0743048e639b736ffe1add9c32da490d13ad9486fe46b6bae98`
MD5	`dc2556fdcc4e9e7816f2ce59f151cd91`
BLAKE2b-256	`7a69ef6e1bfa0f839660de70d80c26c34b6844bd2030b126c60f5d00b49f1669`

See more details on using hashes here.

calamum-test 0.3.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Calamum Test

Install and verify

Command surface

Top-level families

Test execution

Project management

Monitor scaffolding

Aggregate reporting

Adding tests to the library

What the fields mean

Controlled library vocabulary (v1)

Status values

Category values

Profile values

Tag values

Policy flag values

Evidence requirement values

Lane classes

How adversarial testing is represented

How the three test classes work together

Retained evidence contract

Filesystem layout and default output contract

Notes

Project resolution order

Quick start

Signing and privileged flows

Python facade

Why this repo exists

Security

Contributing

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes