rawctx CLI and SDK for semantic packages, answer audit evidence, OTel trace ingest, and trust proofs

rawctx CLI

Python CLI and SDK for rawctx Hub. rawctx records AI answer evidence with approved package refs, context hashes, inference commitments, evidence source_refs, correction history, OpenTelemetry (OTel) trace-bundle ingest, and trust proof status. Teams can show what an answer used, which model/config run produced it, whether the proof is externally anchored or still pending, and how the evidence changed over time without replacing their existing observability stack.

Guides:

../docs/guides/package-workflow.md
../docs/guides/metricflow-native-workflow.md
../docs/diff.md
Answer Audit docs: https://hub.rawctx.dev/docs/answer-audit
Independent Reconciliation docs: https://hub.rawctx.dev/docs/reconciliation
Trust proof foundation: https://github.com/pasar6987-create/rawctx/blob/main/docs/spec/trust-proof-foundation.md

OTel support:

rawctx.ingest_otel_trace_bundle() records a submitted OpenTelemetry GenAI trace bundle as answer-audit evidence and ties it to approved semantic package definitions.
LangSmith, Langfuse, or your own OpenTelemetry runtime can remain the trace system of record. rawctx only records the trace evidence submitted for answer review.
OTel is used as an evidence input on top of runtime logs; rawctx does not claim the upstream trace is ground truth by itself.

Commands

User:

rawctx login [--registry URL] [--id-token JWT] [--token-name NAME] [--expires-in-days N] [--no-browser] [--json]
rawctx logout [--local-only] [--json]
rawctx search [QUERY] [--format F] [--source-format F] [--origin all|native|indexed] [--domain D] [--source S] [--tags CSV] [--sort similarity|recent|name] [--page N] [--size N] [--json] [--offline] [--registry URL]
rawctx info PACKAGE_REF [--json] [--offline] [--registry URL]
rawctx download PACKAGE_REF MODEL_PATH [--local-dir DIR] [--stdout] [--offline] [--force] [--json] [--registry URL]
rawctx snapshot-download PACKAGE_REF [--local-dir DIR] [--offline] [--force] [--json] [--registry URL]
rawctx to-prompt PACKAGE_REF [--datasets CSV] [--max-tokens N] [--offline] [--mode agent_context|strict_metric] [--metrics CSV] [--question TEXT] [--budget-policy compact|error_if_required_missing] [--render-format text|xml] [--json] [--registry URL]
rawctx validate [TARGET] [--format auto|manifest|osi] [--show-dataset-measures] [--json]
rawctx pack [TARGET_DIR] [--output-dir DIR] [--json]
rawctx convert --from metricflow --to osi INPUT_PATH --output DIR [--package-name @scope/name] [--package-version X.Y.Z] [--overwrite] [--json]
rawctx publish [TARGET_DIR] [--private] [--org ORG] [--registry URL]
rawctx publish --from-dbt DBT_PROJECT_DIR [--native] [--emit-package DIR] [--package-name @scope/name] [--package-version X.Y.Z] [--private] [--org ORG] [--registry URL]
rawctx diff A B [--format text|json|github|markdown|junit] [--consumer sql|python|llm|all] [--severity breaking|behavioral|cosmetic|all] [--exit-code-on breaking|behavioral|none] [--max-tokens N] [--output PATH]
rawctx diff semantic A B [--format text|json|github|markdown|junit] [--consumer sql|python|llm|all] [--severity breaking|behavioral|cosmetic|all]
rawctx diff prompt A B [--format text|json|github|markdown|junit] [--max-tokens N]
rawctx diff eval A B --questions FILE [--format text|json|github|markdown|junit] [--runs N] [--model NAME]
rawctx trust status [--json] [--registry URL]
rawctx trust policy [--json] [--registry URL]
rawctx trust anchor run [--force] [--json] [--registry URL]
rawctx trust proof answer LOG_ID [--output proof.json] [--json] [--registry URL]
rawctx trust verify proof.json [--online] [--json] [--registry URL]
rawctx reconcile run PAYLOAD.json [--json] [--registry URL]
rawctx reconcile get RUN_ID [--json] [--registry URL]
rawctx reconcile findings RUN_ID [--type TYPE] [--json] [--registry URL]
rawctx reconcile proof RUN_ID [--output proof.json] [--json] [--registry URL]

Maintainer:

rawctx claim PACKAGE_REF [--json] [--registry URL]

Ops:

rawctx index dbt --seed-file PATH [--only owner/name] [--limit N] [--dry-run] [--json] [--registry URL]
rawctx index git --repo owner/name --source-ref REF --package-version X.Y.Z [--package-name NAME] [--scope SCOPE] [--model-glob GLOB ...] [--dry-run] [--json] [--registry URL]

Supported Package Lanes

rawctx currently supports two published package formats:

format=osi: packaged OSI YAML files
format=metricflow: native MetricFlow/dbt snapshot packages

Both lanes support:

rawctx info
rawctx snapshot-download
rawctx.load()
rawctx.to_prompt()
rawctx diff

download PACKAGE_REF MODEL_PATH also works for both lanes, but only for files listed in manifest.models.

rawctx diff accepts three artifact inputs:

@scope/name@version
local package directories
.rawctx.tar.gz archives

It compares artifacts only. It never queries a warehouse.

Independent Reconciliation

Independent reconciliation compares an external reference set of decision ids with rawctx answer-audit logs for the same application, environment, and short period. It is a detective control: it does not block a decision inline, but it does surface missing answer logs, unexpected rawctx records, late backfills, duplicates, and unanchored records. Each run produces a receipt that binds the reference-set root, rawctx-set root, finding counts, period, app/environment, and source commitment. The receipt can be anchored in the trust log and checked with a proof bundle.
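
The set comparison behind these findings can be sketched in plain Python. This is a minimal illustration and not the Hub's implementation: the helper name reconcile_ids and the finding labels are hypothetical, and late backfills and unanchored records are left out because they depend on timestamps and trust state.

```python
from collections import Counter


def reconcile_ids(reference_ids, rawctx_ids):
    """Compare two lists of decision ids and group the differences.

    Hypothetical helper for illustration; the labels are not rawctx finding types.
    """
    reference = set(reference_ids)
    logged = set(rawctx_ids)
    counts = Counter(rawctx_ids)
    return {
        # Present in the external reference but never logged in rawctx.
        "missing_answer_log": sorted(reference - logged),
        # Logged in rawctx but absent from the external reference.
        "unexpected_record": sorted(logged - reference),
        # Logged more than once for the same decision id.
        "duplicate": sorted(i for i, n in counts.items() if n > 1),
    }


findings = reconcile_ids(
    ["loan-review:app-123", "loan-review:app-124"],
    ["loan-review:app-123", "loan-review:app-123", "loan-review:app-999"],
)
print(findings)
```

Because the comparison only needs ids, the same shape works whether the reference comes from a LOS export or an auditor-controlled extract.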

Use a reference that the application team cannot rewrite in lockstep with rawctx, such as a lending LOS, ATS, provider usage export, or auditor-controlled extract. The request payload carries raw decision ids so rawctx can compare the sets, but persisted findings and proofs are commitment-oriented. Do not put secrets, API keys, raw customer exports, raw PII, or private system URLs in payload metadata.

rawctx reconcile run reconciliation.json --json
rawctx reconcile findings RUN_ID --json
rawctx reconcile proof RUN_ID --output reconciliation-proof.json
rawctx trust verify reconciliation-proof.json --online

import rawctx

run = rawctx.create_reconciliation_run(
    source_key="los-daily-export",
    source_type="business_system_export",
    period_start="2026-06-12T00:00:00Z",
    period_end="2026-06-13T00:00:00Z",
    application_key="loan_review_bot",
    environment="production",
    key_field="idempotency_key",
    reference_records=[
        {"decision_id": "loan-review:app-123", "occurred_at": "2026-06-12T09:13:00Z"},
    ],
)
findings = rawctx.list_reconciliation_findings(run["id"])
proof = rawctx.proof_reconciliation(run["id"])

Answer Audit Evidence

The Python SDK can register reusable evidence first, then record one audit shell per application answer that cites approved semantic references, external trace ids, inference commitments, evidence source_refs, and later correction, void, or redaction events.

Answer logs are hash-only by default for raw question and answer text. Tenant settings can opt in to raw text storage, but question_hash and answer_hash remain available either way. P3 evidence APIs add a separate evidence path for text, audio, and video assets, sanitized segments, runtime media stream retrieval, short-lived auditor downloads, Text Gate alpha retrieval, tamper-evident ledger verification, and OpenTelemetry (OTel) trace-bundle ingest. Reference audio and video evidence should be registered before the answer log, then retrieved at answer time and cited through the returned source_ref.

Part A inference commitments bind model-run provenance into the same answer record. Use build_inference_commitment() to attach model_ref, runtime ids, input_hash, output_hash, and config_hash. With public sha256: answer hashes, output_hash must match the answer log's answer_hash. With HMAC answer hashes, the public output_hash is bound by inference_commitment_hash. Use provider_attested for hosted model APIs where weights are not independently available; use weight_verified only when you can provide a weights_digest.
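
For the public sha256: case, the rule that output_hash must match answer_hash can be illustrated with the standard library alone. This is a sketch under assumptions: the helper sha256_ref is hypothetical, and hashing the UTF-8 bytes of the text into a "sha256:<hex>" string is an assumed convention, not a documented rawctx canonicalization.

```python
import hashlib


def sha256_ref(text: str) -> str:
    """Return a 'sha256:<hex>' reference for UTF-8 text (assumed convention)."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


answer_text = "The Pro plan drove most expansion MRR."
answer_hash = sha256_ref(answer_text)

# With public sha256 hashes, the commitment's output_hash has to equal the
# answer log's answer_hash, so both are derived from the same value here.
output_hash = answer_hash
hashes_match = output_hash == answer_hash
print(hashes_match)
```

Computing the hash once and reusing it for both fields avoids a mismatch caused by hashing slightly different strings.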

Part B inference proofs are additive and disabled by default in production. A tenant policy must enable a backend and budget before proof jobs can run. General users cannot access Hub Platform Admin directly. If you need proof jobs, request tenant policy, backend, and budget enablement from your workspace admin, platform admin, or the rawctx operations team.

Platform admins open /admin/tenants/{tenant_slug} in Hub Platform Admin and use Inference proof enabled as the tenant-level master switch. EZKL zkML backend is the costlier proof backend switch under that policy, and it is still bounded by daily/monthly caps, per-job cost caps, queued/concurrent job limits, artifact size, retry limits, and the production global backend allowlist.

build_inference_proof() creates a small hash-only envelope for externally computed proofs; raw proof bytes, secrets, prompts, answers, and private weights should stay out of the envelope. Production EZKL verification runs through a worker path, not the API request path. The current pilot architecture uses an on-demand ECS task so there is no always-on proof worker cost while no proof jobs are queued.

Hub web follows the same reference-first model. Tenant managers register audio/video in private workspace settings, copy the source_ref, and pass it in source_refs when creating the answer log. The Media evidence vault is not exposed from the Public Hub navigation or public settings routes. Web uploads are capped at 5 MB; direct SDK/API registration can use the backend evidence limits. Runtime media retrieval always requires a purpose, records an access event, and returns a short-lived stream_url for the agent to read before logging the answer. Manual auditor retrieval remains available through a separate download API. Download filenames are emitted with a safe ASCII fallback plus UTF-8 filename* so non-ASCII filenames work with S3 presigned downloads.
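
The filename handling described above follows a common HTTP pattern: a plain ASCII filename parameter plus an RFC 5987 filename* parameter that carries the UTF-8 name. The sketch below shows that pattern only; the function content_disposition and its fallback rules are hypothetical and may differ from what the Hub emits.

```python
import unicodedata
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment header with an ASCII fallback and UTF-8 filename*."""
    # Split accented characters into base letter + mark, then drop non-ASCII.
    decomposed = unicodedata.normalize("NFKD", filename)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace anything that is unsafe inside a quoted header value.
    fallback = "".join(c if c.isalnum() or c in "._- " else "_" for c in ascii_only)
    fallback = fallback.strip() or "download"
    # Percent-encode the UTF-8 bytes for the extended parameter.
    encoded = quote(filename, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


header = content_disposition("r\u00e9sum\u00e9-call.wav")
print(header)
```

Clients that understand filename* use the UTF-8 name, while older clients fall back to the ASCII one.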

from pathlib import Path

import rawctx

media = rawctx.register_media_evidence_asset(
    filename="support-call.wav",
    mime_type="audio/wav",
    asset_type="audio",
    content=Path("support-call.wav").read_bytes(),
    metadata={"case_id": "case-123"},
    registry="https://api.rawctx.dev",
)

retrieval = rawctx.retrieve_media_evidence(
    media["evidence_asset_id"],
    purpose="answer_generation",
    external_trace_id="req_123",
    external_message_id="msg_456",
    registry="https://api.rawctx.dev",
)
# Stream retrieval["stream_url"] into the agent before it expires.

answer_hash = "sha256:..."
inference_commitment = rawctx.build_inference_commitment(
    model_ref={
        "provider": "openai",
        "model_name": "gpt-4o",
        "model_version": "2026-05-31",
        "weights_digest": None,
        "tokenizer_digest": None,
        "dtype": None,
        "quantization": "none",
    },
    input_hash="sha256:...",
    output_hash=answer_hash,
    config_hash=rawctx.hash_inference_config({"temperature": 0, "top_p": 1}),
    attestation_level="provider_attested",
    runtime={"library_versions": {}, "runtime_ids": {}, "seed": None},
)

log = rawctx.log_answer(
    application_key="analytics_bot",
    environment="production",
    idempotency_key="analytics_bot:req_123:msg_456",
    external_trace_id="req_123",
    question_text="Which plan drove expansion MRR?",
    answer_hash=answer_hash,
    inference_commitment=inference_commitment,
    semantic_refs=[
        {
            "package_ref": "@acme/revenue-metrics",
            "package_version": "1.2.0",
            "context_hash": "sha256:...",
            "metrics": ["mrr"],
        }
    ],
    source_refs=[retrieval["source_ref"]],
    evidence_access_event_ids=[retrieval["access_event_id"]],
    policy_flags={"approved_definition_only": True},
    registry="https://api.rawctx.dev",
)

proof = rawctx.build_inference_proof(
    tenant_id=log["tenant_id"],
    answer_log_id=log["id"],
    proof_type="provider_attestation",
    backend="provider_attested",
    commitment_hash=log["inference_commitment_hash"],
    answer_hash=answer_hash,
    input_hash=inference_commitment["input_hash"],
    output_hash=answer_hash,
    config_hash=inference_commitment["config_hash"],
    model_ref_hash=rawctx.hash_inference_model_ref(inference_commitment["model_ref"]),
    public_inputs_hash=inference_commitment["input_hash"],
    proof_artifact_hash="sha256:...",
    proof_artifact_ref={"uri": "urn:rawctx:proof:...", "size_bytes": 512},
)
job = rawctx.create_inference_proof_job(
    log["id"],
    proof_type="provider_attestation",
    backend="provider_attested",
    estimated_cost_cents=1,
    registry="https://api.rawctx.dev",
)
rawctx.submit_inference_proof(log["id"], proof=proof, job_id=job["id"], registry="https://api.rawctx.dev")

media_assets = rawctx.list_media_evidence_assets(asset_type="audio", registry="https://api.rawctx.dev")
download = rawctx.request_media_evidence_asset_download(
    media["evidence_asset_id"],
    purpose="auditor_media_review",
    registry="https://api.rawctx.dev",
)
# Use download["download_url"] before it expires; rawctx records the access event.

supplemental = rawctx.list_answer_evidence_assets(log["id"], registry="https://api.rawctx.dev")
segments = rawctx.list_answer_segments(log["id"], registry="https://api.rawctx.dev")
retrieved = rawctx.retrieve_text_gate_alpha(
    "expansion MRR evidence",
    application_key="analytics_bot",
    include_hash_only=True,
    registry="https://api.rawctx.dev",
)
otel_log = rawctx.ingest_otel_trace_bundle(
    application_key="analytics_bot",
    external_trace_id="req_123",
    trace_bundle={
        "resourceSpans": [
            {
                "resource": {
                    "attributes": [
                        {"key": "service.name", "value": {"stringValue": "analytics_bot"}},
                    ],
                },
                "scopeSpans": [
                    {
                        "spans": [
                            {
                                "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
                                "spanId": "00f067aa0ba902b7",
                            }
                        ]
                    }
                ],
            }
        ]
    },
    semantic_refs=[{"package_ref": "@acme/revenue-metrics", "package_version": "1.2.0"}],
    registry="https://api.rawctx.dev",
)

Use RawctxClient or AsyncRawctxClient when a service should share one registry, token, and timeout across answer audit calls:

create_answer_log(), also exposed as top-level log_answer()
register_media_evidence_asset()
list_media_evidence_assets()
retrieve_media_evidence() for runtime agent stream access
request_media_evidence_asset_download()
request_answer_evidence_asset_upload() for log-scoped supplemental assets
register_answer_evidence_asset() for log-scoped supplemental assets
list_answer_evidence_assets() for sanitized answer detail evidence
request_answer_evidence_asset_download() for supplemental original retrieval
list_answer_segments()
ingest_otel_trace_bundle()
retrieve_text_gate_alpha()
append_answer_log_event()
export_answer_logs()
trust_status()
trust_policy()
run_trust_anchor()
proof_answer()
verify_proof_bundle_online()
create_inference_proof_job(), submit_inference_proof(), and list_answer_inference_proofs()
build_inference_commitment(), build_inference_proof(), hash_inference_config(), hash_inference_model_ref(), and sha256_file()
create_reconciliation_run(), list_reconciliation_runs(), get_reconciliation_run(), list_reconciliation_findings(), and proof_reconciliation()

OpenTelemetry support is intentionally layered on top of your existing tracing: rawctx records the submitted trace bundle and binds it to approved definitions without claiming that the upstream runtime trace itself is ground truth. This is the public-safe path for LangSmith, Langfuse, or another OTel-compatible runtime:

Keep LangSmith/Langfuse as the observability workspace.
Export or forward the OTel trace shape that is already safe for answer review.
Submit only the trace ids, span ids, service name, redacted question/answer text or hashes, semantic_refs, source_refs, and optional inference_commitment that should be bound to the answer audit record.
Do not send LangSmith/Langfuse API keys, private prompts, tool outputs, customer media, or secrets to rawctx unless your tenant retention settings and review policy explicitly allow that data.

Two OTel ingest paths are public:

rawctx.ingest_otel_trace_bundle() posts to /api/answer-audit-logs/otel-trace-bundles. The server maps the submitted bundle's traceId, span count, service.name, and bundle hash into one answer audit log.
POST /api/answer-audit-logs/otel accepts OTLP-style resourceLogs and resourceSpans. It maps service.name or rawctx.audit.application_key, deployment.environment, traceId, spanId, rawctx.audit.* fields, GenAI message attributes, enduser.id_hash, session.id_hash, semantic_refs, source_refs, and inference_commitment into answer audit fields.
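
To make the first mapping concrete, the sketch below pulls the trace ids, span count, service.name, and a bundle hash out of an OTLP-style JSON bundle on the client side. It is an illustration only: summarize_trace_bundle is a hypothetical helper, and hashing a sorted-key compact JSON form is an assumption, not the server's documented canonicalization.

```python
import hashlib
import json


def summarize_trace_bundle(bundle: dict) -> dict:
    """Extract trace ids, span count, service.name, and a hash from a bundle."""
    trace_ids, span_count, service_name = set(), 0, None
    for resource_span in bundle.get("resourceSpans", []):
        for attr in resource_span.get("resource", {}).get("attributes", []):
            if attr.get("key") == "service.name":
                service_name = attr.get("value", {}).get("stringValue")
        for scope_span in resource_span.get("scopeSpans", []):
            for span in scope_span.get("spans", []):
                span_count += 1
                trace_ids.add(span.get("traceId"))
    # Assumption: hash a canonical JSON form (sorted keys, compact separators).
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "trace_ids": sorted(t for t in trace_ids if t),
        "span_count": span_count,
        "service_name": service_name,
        "bundle_hash": "sha256:" + digest,
    }


bundle = {
    "resourceSpans": [
        {
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": "analytics_bot"}},
                ],
            },
            "scopeSpans": [
                {"spans": [{"traceId": "4bf92f3577b34da6a3ce929d0e0e4736", "spanId": "00f067aa0ba902b7"}]}
            ],
        }
    ]
}
summary = summarize_trace_bundle(bundle)
print(summary["service_name"], summary["span_count"])
```

A local summary like this is useful for checking that a bundle contains only the ids you intend to submit before it leaves your runtime.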

Trust Proofs

rawctx trust proofs are separate from the answer payload. They show whether the answer audit record is only locally signed, waiting for external confirmation, or externally anchored.

LOCAL_ONLY: the record has a Merkle leaf, signed tree head, and local receipt, but no independent external anchor yet.
PENDING: the proof structure and signatures are valid, but OpenTimestamps Bitcoin attestation or witness policy completion is still outstanding.
BITCOIN_OBSERVED_PENDING_CONFIRMATION: the OpenTimestamps proof contains a Bitcoin block-header attestation, but rawctx is not yet calling it finalized.
ANCHORED: a confirmed external anchor is present and the required witness condition is satisfied.
INVALID: a structural, hash, signature, anchor, or witness check failed.
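
A consumer that gates releases or reviews on these statuses might treat them as follows. The status names come from the list above; the release_gate function and its accept/wait/reject policy are hypothetical choices for illustration, not rawctx behavior.

```python
from enum import Enum


class TrustStatus(str, Enum):
    LOCAL_ONLY = "LOCAL_ONLY"
    PENDING = "PENDING"
    BITCOIN_OBSERVED_PENDING_CONFIRMATION = "BITCOIN_OBSERVED_PENDING_CONFIRMATION"
    ANCHORED = "ANCHORED"
    INVALID = "INVALID"


def release_gate(status: str) -> str:
    """Map a trust status string to an illustrative gate decision."""
    parsed = TrustStatus(status)  # raises ValueError for unknown statuses
    if parsed is TrustStatus.ANCHORED:
        return "accept"
    if parsed is TrustStatus.INVALID:
        return "reject"
    # LOCAL_ONLY, PENDING, and BITCOIN_OBSERVED_PENDING_CONFIRMATION are
    # not failures; they are still waiting on an external step.
    return "wait"


print(release_gate("PENDING"))
```

Keeping the intermediate states distinct from INVALID avoids treating a slow external anchor as a verification failure.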

Production rawctx can combine:

Merkle inclusion proof for the answer audit leaf
optional inference commitment binding for model/version/config provenance
AWS KMS-signed STH/checkpoint
OpenTimestamps hash-only Bitcoin anchoring
Sigstore Rekor witness receipts after SET, inclusion proof, signed checkpoint, log ID, and entry body verification
receipt bundle retention in S3 Object Lock when configured
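
For readers unfamiliar with the first item, a Merkle inclusion proof lets a verifier recompute the tree root from one leaf plus a short list of sibling hashes. The sketch below is a generic illustration with RFC 6962-style leaf and node prefixes; the function names and the explicit left/right path encoding are assumptions and do not describe the rawctx proof bundle format.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    """Largest power of two strictly less than n (n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves: list) -> bytes:
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = _split(len(leaves))
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def audit_path(leaves: list, index: int) -> list:
    """Sibling hashes from the leaf up to the root, tagged with their side."""
    if len(leaves) == 1:
        return []
    k = _split(len(leaves))
    if index < k:
        return audit_path(leaves[:k], index) + [("R", merkle_root(leaves[k:]))]
    return audit_path(leaves[k:], index - k) + [("L", merkle_root(leaves[:k]))]


def verify_inclusion(leaf: bytes, path: list, root: bytes) -> bool:
    current = leaf_hash(leaf)
    for side, sibling in path:
        current = node_hash(sibling, current) if side == "L" else node_hash(current, sibling)
    return current == root


leaves = [b"log-1", b"log-2", b"log-3", b"log-4", b"log-5"]
root = merkle_root(leaves)
path = audit_path(leaves, 3)
print(verify_inclusion(leaves[3], path, root))
```

The verifier never needs the other leaves, only the sibling hashes, which is why a proof bundle can stay small and hash-only.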

Post-quantum trust proof mode

The post-quantum path keeps public anchoring hash-only. rawctx does not publish raw questions, raw answers, tenant ids, app ids, external ticket ids, or proof artifact URIs into OpenTimestamps, Rekor, or public proof bundles. Public bundles use tenant_public_id, public commitments, Merkle leaves, signed tree heads, checkpoints, and receipts.

hmac-sha256 commitments are the privacy layer: they bind question, answer, payload, public tenant, and selected reference commitments to a tenant key so a stored or externally anchored hash is not useful as a raw-text or customer identifier oracle. HMAC is not a signature scheme, but HMAC-SHA-256 remains quantum-safe for this commitment use with an adequate secret key.
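
The privacy property can be seen in a small standard-library sketch. The helper names, the label-plus-separator message layout, and the "hmac-sha256:<hex>" output format are assumptions made for illustration; rawctx's actual commitment encoding is not specified here.

```python
import hashlib
import hmac


def hmac_commitment(tenant_key: bytes, label: str, value: str) -> str:
    """Keyed commitment over a labelled value (the label separates domains)."""
    message = label.encode("utf-8") + b"\x00" + value.encode("utf-8")
    digest = hmac.new(tenant_key, message, hashlib.sha256).hexdigest()
    return "hmac-sha256:" + digest


def verify_commitment(tenant_key: bytes, label: str, value: str, commitment: str) -> bool:
    expected = hmac_commitment(tenant_key, label, value)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, commitment)


key_a = b"tenant-a-secret-key-material-000"
key_b = b"tenant-b-secret-key-material-000"
commitment = hmac_commitment(key_a, "answer", "Pro plan drove expansion MRR.")
print(verify_commitment(key_a, "answer", "Pro plan drove expansion MRR.", commitment))
```

Without the tenant key, an observer cannot test guesses of the raw text against the published value, which is the difference from a plain SHA-256 hash.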

ML-DSA is the post-quantum signature layer for AWS KMS STH/checkpoint signing epochs. Rekor is retained as a witness by using a separate Rekor-compatible ECDSA/RSA KMS witness key, because Rekor hashedrekord signatures still need a SHA-256-compatible public-log signature format. Public verification only accepts Rekor witness keys authorized by the signed checkpoint witness_key_refs.

Until a public verifier implements direct ML-DSA signature verification, ML-DSA proofs are not marked verified. API, CLI, and web verifier surfaces report the critical signature check as unsupported and keep the result non-OK with SIGNATURE_UNVERIFIED semantics instead of silently treating the proof as verified.

When present, inference_commitment is canonicalized internally as rawctx.inference.commitment.v1. Public proof bundles carry rawctx.inference.commitment.proof.v1 with a rawctx.inference.commitment.public.v1 summary only; raw model_ref, runtime, and provider attestation material stay out of the public bundle. Verification requires the public proof hash to match the leaf subject's inference_commitment_hash, and commitment.output_hash to match the answer audit answer_hash.

When present, inference_proofs are sidecars in the same proof bundle. Offline verification checks that each rawctx.inference.proof.v1 envelope, rawctx.inference.proof.statement.v2 statement, verification hash, proof trust leaf, and inclusion proof bind back to the same tenant context, answer log, answer hash, and Part A commitment hash. Public trust leaves may use tenant_public_id instead of the internal tenant_id when HMAC privacy commitments are required. Legacy statement v1 bundles remain accepted for older receipts, but new Part B proofs use statement v2 with verification_hash. Inference proof status is reported separately from trust_status; a proof can be commitment_bound or adapter_verified while the trust tree is still waiting for external anchoring.

For tenant-scoped Part B operation, general users should request proof enablement from a workspace admin, platform admin, or the rawctx operations team; they should not expect direct access to Hub Platform Admin. Platform admins can enable or disable the policy from /admin/tenants/{tenant_slug}. Inference proof enabled allows new proof jobs for that tenant; EZKL zkML backend allows the EZKL backend only when the production allowlist also includes ezkl_v1. The server enforces a reserved-cost floor for EZKL jobs, and production keeps synchronous API verification off. Worker execution stays on a manual one-shot ECS task during pilot usage; move to scheduled or queue-triggered Fargate only after queue volume justifies it.

OpenTimestamps and Rekor do not require a customer account. OpenTimestamps may remain pending until a Bitcoin attestation is available. A Bitcoin-observed proof is stronger than a calendar-pending proof, but it is reported separately until the final confirmation policy is satisfied. Rekor is a public transparency log witness observation; it is not a Bitcoin anchor. S3 Object Lock is a preservation layer for receipt bundles, not the source of independent corroboration.

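Summary:

A trust proof is a bundle for verifying the answer audit log's leaf, STH signature, anchor receipt, and witness receipt, not the original answer text.
PENDING is not a failure. The structure and signatures can be verified, but the record is still waiting for an OpenTimestamps Bitcoin attestation or for witness policy completion.
BITCOIN_OBSERVED_PENDING_CONFIRMATION is an intermediate state: a Bitcoin block header has been observed, but rawctx does not yet report it as final.
Rekor provides a witness receipt observed in a public transparency log, and OpenTimestamps is the path for anchoring the checkpoint hash in Bitcoin.
S3 Object Lock is the retention layer for receipt bundles. Independent confirmation comes from the external anchor and the witness.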

rawctx trust status --json
rawctx trust proof answer "$ANSWER_LOG_ID" --output proof.json
rawctx trust verify proof.json --online --json

import rawctx

answer_log_id = "anslog_..."

with rawctx.RawctxClient(registry="https://api.rawctx.dev") as client:
    proof = client.proof_answer(answer_log_id)
    online = client.verify_proof_bundle_online(proof)
    latest = client.trust_status()

print(online["trust_status"])
print(latest["external_anchor_status"])

Package Refs and `latest`

Package refs can be exact or pointer-based:

@scope/name@1.2.3 pins one immutable published version
@scope/name@latest asks the registry for the workspace-approved latest version
@scope/name behaves like @scope/name@latest for download, load, and prompt workflows

When the registry returns resolution metadata, rawctx preserves the requested ref, resolved concrete version, and snapshot SHA-256 in the JSON-shaped response or prompt context. Use exact pins in CI or release automation when a job must be independent of future latest promotions.
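
The three ref forms can be told apart with a small parser. This is a sketch under assumptions: parse_package_ref, PackageRef, and the regular expression are hypothetical and may be stricter or looser than the CLI's own parsing.

```python
import re
from typing import NamedTuple


class PackageRef(NamedTuple):
    scope: str
    name: str
    version: str  # concrete version string, or "latest" for the pointer


_REF = re.compile(r"^@(?P<scope>[^/@\s]+)/(?P<name>[^/@\s]+)(?:@(?P<version>[^@\s]+))?$")


def parse_package_ref(ref: str) -> PackageRef:
    """Split '@scope/name[@version]' and default a bare ref to 'latest'."""
    match = _REF.match(ref)
    if match is None:
        raise ValueError(f"Not a package ref: {ref!r}")
    return PackageRef(match["scope"], match["name"], match["version"] or "latest")


def is_pinned(ref: PackageRef) -> bool:
    """True when the ref names one version rather than the latest pointer."""
    return ref.version != "latest"


for text in ("@scope/name@1.2.3", "@scope/name@latest", "@scope/name"):
    parsed = parse_package_ref(text)
    print(text, "->", parsed.version, "pinned" if is_pinned(parsed) else "pointer")
```

A CI job could use a check like is_pinned to refuse pointer-based refs before it runs.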

Compare Packages

Use rawctx diff when you need semantic-level change review instead of raw file diffs.

rawctx diff ./pkg-v1 ./pkg-v2
rawctx diff semantic ./pkg-v1 ./pkg-v2 --format json
rawctx diff prompt ./pkg-v1 ./pkg-v2 --max-tokens 2000
rawctx diff eval ./pkg-v1 ./pkg-v2 --questions questions.jsonl --runs 5 --model mock

The top-level command runs both the semantic and prompt diffs. eval stays opt-in because it measures model behavior, not deterministic package structure.

Notebook / Code

Search uses the public Hub index first so CLI and SDK results match the logged-out web experience. If a search returns no public matches and you have a token configured, rawctx retries with authenticated search.

Notebook shell style:

!rawctx search "semantic model" --sort similarity --json
!rawctx info @scope/name --json
!rawctx snapshot-download @scope/name --json
!rawctx download @scope/name models/customers.yml --json
!rawctx to-prompt @scope/name --datasets customers,order_item --max-tokens 2000
!rawctx validate ./my-package --json

Python API:

import rawctx

result = rawctx.search("semantic model", registry="https://api.rawctx.dev", sort="similarity")
pkg = rawctx.info("@scope/name", registry="https://api.rawctx.dev")
model = rawctx.load("@scope/name", registry="https://api.rawctx.dev")
prompt = rawctx.to_prompt(
    "@scope/name",
    datasets=["customers", "order_item"],
    max_tokens=2000,
    registry="https://api.rawctx.dev",
)

print(model.format_name)    # "osi" or "metricflow"
print(model.datasets)       # normalized dataset names
print(model.measures)       # [Measure(name="...", ...)]
print(model.dimensions)     # [Dimension(name="...", ...)]
print(model.relationships)  # [Relationship(name="...", ...)]
print(prompt)
print(pkg["model_paths"])

snapshot_dir = rawctx.snapshot_download("@scope/name", registry="https://api.rawctx.dev")
model_path = rawctx.download("@scope/name", "models/customers.yml", registry="https://api.rawctx.dev")
validation = rawctx.validate("./my-package")
semantic = rawctx.semantic_diff("./pkg-v1", "./pkg-v2")
prompt_diff = rawctx.prompt_diff("./pkg-v1", "./pkg-v2", max_tokens=2000)
combined = rawctx.diff_artifacts("./pkg-v1", "./pkg-v2")

Async Python API:

import asyncio
import rawctx

async def main():
    async with rawctx.AsyncRawctxClient(registry="https://api.rawctx.dev") as client:
        result = await client.search("semantic model", sort="similarity")
        model = await client.load("@scope/name")
        prompt = await client.to_prompt("@scope/name", datasets=["customers", "order_item"])
        snapshot_dir = await client.snapshot_download("@scope/name")
        diff_report = await client.diff("./pkg-v1", "./pkg-v2")
        return result, model, prompt, snapshot_dir, diff_report

asyncio.run(main())

`to_prompt()` Behavior

rawctx.to_prompt() turns a package snapshot into compact LLM context. It uses the same normalized semantic objects as load(), then applies package metadata, dataset filters, and prompt budget settings to render agent-ready text.

The same prompt compiler is available from the CLI:

rawctx to-prompt @scope/name --datasets customers,order_item --max-tokens 2000
rawctx to-prompt @scope/name --mode strict_metric --metrics mrr --render-format xml
rawctx to-prompt @scope/name --json

The rendered prompt keeps a predictable section shape:

Domain: {domain} ({package_name})

Models:
...

Datasets:
...

Metrics:
...

Relationships:
...

Dataset filters preserve the requested order, drop duplicates, and fail with UsageError: Unknown dataset(s): ... when a requested dataset is not present. Selecting a subset keeps context focused on the requested datasets and their relevant relationships.
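
The filter rules above (keep requested order, drop duplicates, reject unknown names) can be sketched as follows. The select_datasets function is hypothetical, and UsageError is defined locally as a stand-in rather than imported from rawctx.

```python
class UsageError(Exception):
    """Local stand-in for the error named above; not imported from rawctx."""


def select_datasets(requested, available):
    """Return requested dataset names in order, deduplicated and validated."""
    seen = set()
    selected = []
    for name in requested:
        if name not in seen:
            seen.add(name)
            selected.append(name)
    unknown = [name for name in selected if name not in available]
    if unknown:
        raise UsageError("Unknown dataset(s): " + ", ".join(unknown))
    return selected


available = {"customers", "orders", "order_item"}
print(select_datasets(["order_item", "customers", "order_item"], available))
```

Failing on an unknown name, instead of silently skipping it, keeps a typo from producing a prompt that is missing the dataset the caller expected.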

max_tokens is a practical size target, not a model-specific tokenizer guarantee. When the budget is tight, rawctx prioritizes high-signal semantic context and compacts lower-priority detail. Use return_context=True when you need the selected objects, estimated size, render hash, omissions, and warnings for logging or review.

Download Behavior

download fetches one file listed in manifest.models
snapshot-download materializes the full extracted package tree
for native MetricFlow packages, snapshot-download is the primary handoff because it restores the full dbt-style snapshot
load() and to_prompt() normalize both OSI and native MetricFlow packages into the same typed Python structures
when using snapshot-download --local-dir, prefer a new or empty directory. --force only replaces an existing rawctx snapshot directory and refuses to wipe the current working directory or unrelated folders
indexed packages remain preview-only and cannot be downloaded directly

Validate / Pack / Publish

validate, pack, and publish all start from a local package directory.

validate: checks the manifest and validates the package according to manifest.format
pack: builds a deterministic local .rawctx.tar.gz
publish: validates again, rebuilds a temporary archive, calculates the checksum, uploads bytes, and completes the version
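
A deterministic archive means the same directory contents always produce the same bytes, and therefore the same checksum. The sketch below shows one common way to get that with the standard library (sorted entries, zeroed timestamps and ownership); build_deterministic_archive is a hypothetical helper and is not how rawctx pack is known to be implemented.

```python
import gzip
import hashlib
import io
import tarfile
import tempfile
from pathlib import Path


def build_deterministic_archive(package_dir: Path) -> bytes:
    """Pack files in sorted order with fixed metadata so output is repeatable."""
    buffer = io.BytesIO()
    # mtime=0 keeps the gzip header identical across runs.
    with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=0) as gz:
        with tarfile.open(fileobj=gz, mode="w") as tar:
            files = [p for p in package_dir.rglob("*") if p.is_file()]
            for path in sorted(files, key=lambda p: p.relative_to(package_dir).as_posix()):
                data = path.read_bytes()
                info = tarfile.TarInfo(name=path.relative_to(package_dir).as_posix())
                info.size = len(data)
                info.mtime = 0          # ignore filesystem timestamps
                info.mode = 0o644       # fixed permissions
                info.uid = info.gid = 0
                info.uname = info.gname = ""
                tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "models").mkdir()
    (root / "rawctx.yaml").write_text('name: "@demo/example"\n', encoding="utf-8")
    (root / "models" / "customers.yml").write_text("models: []\n", encoding="utf-8")
    first = build_deterministic_archive(root)
    second = build_deterministic_archive(root)

checksum = hashlib.sha256(first).hexdigest()
print(first == second, len(checksum))
```

Stable bytes matter here because publish calculates a checksum over the archive it rebuilds.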

Published versions are immutable release artifacts. Private workspaces can optionally require approval before a published version is promoted to latest. When that governance policy is enabled, publish still creates the version, but latest moves only after the request is reviewed and approved in rawctx Hub. When governance is disabled, direct latest promotion keeps the existing lightweight behavior.

Package directories are no longer OSI-only.

OSI package example:

my-osi-package/
  rawctx.yaml
  README.md
  models/
    sales_summary.osi.yaml
    customers.osi.yaml

Native MetricFlow package example:

my-metricflow-package/
  rawctx.yaml
  README.md
  dbt_project.yml
  models/
    customers.yml
    orders.yml

Native MetricFlow manifest example:

name: "@demo/jaffle-metrics"
version: "1.0.0"
format: "metricflow"
source_format: "metricflow"
description: "Native MetricFlow package"
models:
  - models/customers.yml
  - models/orders.yml
include:
  - dbt_project.yml
repository: "https://github.com/dbt-labs/jaffle-sl-template"

Notes:

format supports osi and metricflow
models must stay relative and must resolve inside the package directory
include is optional and is mainly useful for native packages that need extra project files such as dbt_project.yml
standalone file validation is still limited to manifest files and OSI files, so rawctx validate models/customers.yml is not a native MetricFlow file validator by itself

Convert Workflow

Inspect-first OSI flow:

rawctx convert --from metricflow --to osi ./my-dbt-project --output ./dist/pkg
rawctx validate ./dist/pkg --json
rawctx pack ./dist/pkg --output-dir ./dist --json

Publish Directly From dbt

Convert to OSI and publish:

rawctx login
rawctx publish --from-dbt ./my-dbt-project --emit-package ./dist/pkg

Publish a native MetricFlow package:

rawctx login
rawctx publish --from-dbt ./my-dbt-project --native --emit-package ./dist/native-pkg
rawctx publish --from-dbt ./my-dbt-project --native --package-name @your-scope/jaffle-shop --package-version 1.2.3

Use --emit-package when you want the generated package directory to remain on disk after the publish run.

Latest Promotion Governance

Governance is about changing the official pointer, not editing the artifact:

published immutable version
        |
request latest promotion in rawctx Hub
        |
review diff or prompt preview when required
        |
approval threshold reached
        |
latest resolves to that concrete version

Workspace admins can enable approval before latest promotion, set the required approval count, choose whether requesters may self-approve, and require semantic diff review. The first governance surface is in the authenticated Hub UI: workspace settings configure the policy, and package version pages create, approve, reject, or cancel latest promotion requests.

CLI and Python consumers do not need a separate governance command to use the result. They keep using exact refs or the approved latest pointer:

rawctx snapshot-download @scope/name@1.2.3
rawctx to-prompt @scope/name@latest --max-tokens 1200

If current latest changes while a request is pending, rawctx marks that request stale instead of moving latest from an unexpected base version. Existing pending requests keep the approval threshold captured when the request was created.

Auth Flow (Auto + Fallback)

Run rawctx login.
CLI opens or prints the OAuth URL from POST /api/auth/login and falls back to the legacy GitHub endpoint if needed.
Complete login in the browser.
CLI automatically polls OAuth session status and captures id_token when the registry supports it.
CLI calls POST /api/auth/token and stores the API token in ~/.rawctx/config.yaml.

Manual fallback:

rawctx login --id-token '<JWT>'
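
The polling step in the automatic flow can be sketched as a bounded loop. This is a generic illustration, not the CLI's code: the session status endpoint is not documented here, so the lookup is passed in as a hypothetical fetch_id_token callable, and the interval and timeout values are assumptions.

```python
import time
from typing import Callable, Optional


def poll_for_id_token(
    fetch_id_token: Callable[[], Optional[str]],
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Call fetch_id_token until it returns a token or the timeout would be exceeded."""
    deadline = clock() + timeout_s
    while True:
        token = fetch_id_token()
        if token:
            return token
        if clock() + interval_s > deadline:
            # Give up; the caller can fall back to the manual --id-token path.
            return None
        sleep(interval_s)
```

A None result corresponds to the case where the manual fallback above applies.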

Config and Environment

Config file (default): ~/.rawctx/config.yaml

registry: "https://api.rawctx.dev"
auth:
  token: "rxctx_..."
  token_id: "uuid"
  token_name: "rawctx-cli"
  issued_at: "2026-02-28T00:00:00+00:00"
profile:
  username: "owner"

Environment overrides:

RAWCTX_CONFIG (config path)
RAWCTX_REGISTRY (registry URL)
RAWCTX_TOKEN (auth token)

Priority: CLI option > env var > config > default.
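
The precedence rule can be sketched in a few lines of Python. This is a minimal illustration, not the CLI's actual resolver: resolve_setting is a hypothetical helper, DEFAULT_REGISTRY is borrowed from the sample config above and may not be the real built-in default, and treating an empty environment variable as unset is an assumption.

```python
import os
from typing import Mapping, Optional

# Assumption: taken from the sample config, not a confirmed built-in default.
DEFAULT_REGISTRY = "https://api.rawctx.dev"


def resolve_setting(
    cli_value: Optional[str],
    env_name: str,
    config_value: Optional[str],
    default: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Apply the order CLI option > env var > config > default."""
    if cli_value is not None:
        return cli_value
    env_value = environ.get(env_name)
    if env_value:  # an empty string is treated as unset (assumption)
        return env_value
    if config_value is not None:
        return config_value
    return default
```

For example, resolve_setting(None, "RAWCTX_REGISTRY", config.get("registry"), DEFAULT_REGISTRY) returns the environment value when RAWCTX_REGISTRY is set and no --registry option was passed, where config is the parsed config file as a dict.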

Offline Mode

--offline is supported for:

search
info
download
snapshot-download

Cache paths:

index: ~/.rawctx/cache/packages.json
archives: ~/.rawctx/cache/archives/@scope/name/<version>.rawctx.tar.gz
snapshots: ~/.rawctx/packages/@scope/name/<version>/
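
To check whether a package is available offline, the cache locations can be derived from an exact ref. The sketch below is illustrative: split_ref and cache_paths are hypothetical helpers, the cache root is assumed to stay at ~/.rawctx, and only exact @scope/name@version refs are handled because resolving latest requires the index.

```python
from pathlib import Path
from typing import Dict, Optional, Tuple


def split_ref(package_ref: str) -> Tuple[str, str]:
    """Split '@scope/name@1.2.3' into ('@scope/name', '1.2.3')."""
    name, sep, version = package_ref.rpartition("@")
    if not sep or "/" not in name or not version or version == "latest":
        raise ValueError(f"expected an exact ref like @scope/name@1.2.3, got {package_ref!r}")
    return name, version


def cache_paths(package_ref: str, home: Optional[Path] = None) -> Dict[str, Path]:
    """Return the index, archive, and snapshot paths listed above for one exact ref."""
    root = (home or Path.home()) / ".rawctx"
    name, version = split_ref(package_ref)
    scope, pkg = name.split("/", 1)  # '@scope', 'name'
    return {
        "index": root / "cache" / "packages.json",
        "archive": root / "cache" / "archives" / scope / pkg / f"{version}.rawctx.tar.gz",
        "snapshot": root / "packages" / scope / pkg / version,
    }
```

Calling cache_paths("@scope/name@1.2.3")["snapshot"].is_dir() is one way to test for a local snapshot before passing --offline.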

Release history

Version	Released
0.3.39 (this version)	Jun 17, 2026
0.3.38	Jun 14, 2026
0.3.37	Jun 12, 2026
0.3.36	Jun 12, 2026
0.3.35	Jun 12, 2026
0.3.34	Jun 11, 2026
0.3.33	Jun 7, 2026
0.3.32	Jun 1, 2026
0.3.31	Jun 1, 2026
0.3.30	May 31, 2026
0.3.29	May 31, 2026
0.3.28	May 29, 2026
0.3.27	May 29, 2026
0.3.26	May 28, 2026
0.3.25	May 28, 2026
0.3.24	May 28, 2026
0.3.23	May 28, 2026
0.3.22	May 28, 2026
0.3.21	May 22, 2026
0.3.20	May 4, 2026
0.3.19	May 4, 2026
0.3.18	Apr 25, 2026
0.3.17	Apr 25, 2026
0.3.16	Apr 25, 2026
0.3.15	Apr 25, 2026
0.3.14	Apr 24, 2026
0.3.13	Apr 24, 2026
0.3.12	Apr 17, 2026
0.3.11	Apr 16, 2026
0.3.10	Apr 14, 2026
0.3.9	Apr 14, 2026
0.3.8	Apr 14, 2026
0.3.7	Apr 14, 2026
0.3.6	Apr 12, 2026
0.3.5	Apr 12, 2026
0.3.4	Mar 16, 2026
0.3.3	Mar 16, 2026
0.3.2	Mar 13, 2026
0.3.1	Mar 8, 2026
0.3.0	Mar 8, 2026
0.2.8	Mar 8, 2026
0.2.7	Mar 8, 2026
0.2.6	Mar 4, 2026
0.2.5	Mar 2, 2026
0.2.4	Mar 2, 2026
0.2.3	Mar 2, 2026
0.2.2	Mar 2, 2026
0.2.1	Mar 2, 2026
0.2.0	Mar 2, 2026
0.1.0	Mar 1, 2026

Download files

Source Distribution

rawctx-0.3.39.tar.gz (211.2 kB), uploaded Jun 17, 2026, Source

Built Distribution

rawctx-0.3.39-py3-none-any.whl (158.2 kB), uploaded Jun 17, 2026, Python 3

File details

Details for the file rawctx-0.3.39.tar.gz.

File metadata

Download URL: rawctx-0.3.39.tar.gz
Upload date: Jun 17, 2026
Size: 211.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for rawctx-0.3.39.tar.gz
Algorithm	Hash digest
SHA256	`61c85a0140eae8c7302b5e6884dae6a450dbbe93ff9927e0a475d78655669df7`
MD5	`776cfb7a9bc762ba7a51c857498c5f9a`
BLAKE2b-256	`78d21b06f4801daa54344ac0764206a866f365f9f53d83b5ed760a7fe5e34088`

File details

Details for the file rawctx-0.3.39-py3-none-any.whl.

File metadata

Download URL: rawctx-0.3.39-py3-none-any.whl
Upload date: Jun 17, 2026
Size: 158.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for rawctx-0.3.39-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4a89169df07c428225f54a76b91211753f320c4b34d4646d62665f0395cffc9d`
MD5	`4184e07ffd8d8e3513352050809cff5a`
BLAKE2b-256	`3418bde5e9335aab64250c4e2fa0a60fea04b38609bcfe770072575db0489210`
