rawctx CLI and SDK for semantic packages, answer audit evidence, and OTel trace ingest
rawctx CLI
Python CLI and SDK for rawctx Hub. rawctx uses OpenTelemetry (OTel)
trace-bundle ingest for Answer Audit, so teams can bind GenAI runtime traces to
the approved package refs, context hashes, and evidence source_refs behind an
answer without replacing their existing observability stack.
Guides:
- ../docs/guides/package-workflow.md
- ../docs/guides/metricflow-native-workflow.md
- ../docs/diff.md
- Answer Audit docs: https://hub.rawctx.dev/docs/answer-audit
OTel support:
- rawctx.ingest_otel_trace_bundle() records submitted OpenTelemetry GenAI trace bundles and ties them to approved semantic package definitions.
- OTel is used as an evidence input on top of your runtime logs; rawctx does not claim the upstream trace is ground truth by itself.
Commands
User:
- rawctx login [--registry URL] [--id-token JWT] [--token-name NAME] [--expires-in-days N] [--no-browser] [--json]
- rawctx logout [--local-only] [--json]
- rawctx search [QUERY] [--format F] [--source-format F] [--origin all|native|indexed] [--domain D] [--source S] [--tags CSV] [--sort similarity|recent|name] [--page N] [--size N] [--json] [--offline] [--registry URL]
- rawctx info PACKAGE_REF [--json] [--offline] [--registry URL]
- rawctx download PACKAGE_REF MODEL_PATH [--local-dir DIR] [--stdout] [--offline] [--force] [--json] [--registry URL]
- rawctx snapshot-download PACKAGE_REF [--local-dir DIR] [--offline] [--force] [--json] [--registry URL]
- rawctx to-prompt PACKAGE_REF [--datasets CSV] [--max-tokens N] [--offline] [--mode agent_context|strict_metric] [--metrics CSV] [--question TEXT] [--budget-policy compact|error_if_required_missing] [--render-format text|xml] [--json] [--registry URL]
- rawctx validate [TARGET] [--format auto|manifest|osi] [--show-dataset-measures] [--json]
- rawctx pack [TARGET_DIR] [--output-dir DIR] [--json]
- rawctx convert --from metricflow --to osi INPUT_PATH --output DIR [--package-name @scope/name] [--package-version X.Y.Z] [--overwrite] [--json]
- rawctx publish [TARGET_DIR] [--private] [--org ORG] [--registry URL]
- rawctx publish --from-dbt DBT_PROJECT_DIR [--native] [--emit-package DIR] [--package-name @scope/name] [--package-version X.Y.Z] [--private] [--org ORG] [--registry URL]
- rawctx diff A B [--format text|json|github|markdown|junit] [--consumer sql|python|llm|all] [--severity breaking|behavioral|cosmetic|all] [--exit-code-on breaking|behavioral|none] [--max-tokens N] [--output PATH]
- rawctx diff semantic A B [--format text|json|github|markdown|junit] [--consumer sql|python|llm|all] [--severity breaking|behavioral|cosmetic|all]
- rawctx diff prompt A B [--format text|json|github|markdown|junit] [--max-tokens N]
- rawctx diff eval A B --questions FILE [--format text|json|github|markdown|junit] [--runs N] [--model NAME]
Maintainer:
- rawctx claim PACKAGE_REF [--json] [--registry URL]
Ops:
- rawctx index dbt --seed-file PATH [--only owner/name] [--limit N] [--dry-run] [--json] [--registry URL]
- rawctx index git --repo owner/name --source-ref REF --package-version X.Y.Z [--package-name NAME] [--scope SCOPE] [--model-glob GLOB ...] [--dry-run] [--json] [--registry URL]
Supported Package Lanes
rawctx currently supports two published package formats:
- format=osi: packaged OSI YAML files
- format=metricflow: native MetricFlow/dbt snapshot packages
Both lanes support:
- rawctx info
- rawctx snapshot-download
- rawctx.load()
- rawctx.to_prompt()
- rawctx diff
download PACKAGE_REF MODEL_PATH also works for both lanes, but only for files listed in manifest.models.
rawctx diff accepts three artifact inputs:
- @scope/name@version
- local package directories
- .rawctx.tar.gz archives
It compares artifacts only. It never queries a warehouse.
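The three input shapes can be told apart by inspection alone. The following is a minimal sketch, not rawctx's own resolver: classify_artifact is a hypothetical helper, and the rule that any string starting with @ and containing a slash is a registry ref is an assumption made for illustration.

```python
ARCHIVE_SUFFIX = ".rawctx.tar.gz"


def classify_artifact(value: str) -> str:
    """Hypothetical helper: guess which kind of diff input a string is.

    Assumption: archives are recognized by suffix, registry refs by a
    leading "@" plus a "/", and everything else is a local directory.
    """
    if value.endswith(ARCHIVE_SUFFIX):
        return "archive"
    if value.startswith("@") and "/" in value:
        return "registry_ref"
    return "local_dir"


for candidate in ("@acme/revenue-metrics@1.2.0", "./pkg-v1", "dist/pkg.rawctx.tar.gz"):
    print(candidate, "->", classify_artifact(candidate))
```

Checking the archive suffix first keeps a local file such as ./@scope/pkg.rawctx.tar.gz from being mistaken for a registry ref.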
Answer Audit Evidence
The Python SDK can register reusable evidence first, then record one audit shell
per application answer that cites approved semantic references, external trace
ids, evidence source_refs, and later correction, void, or redaction events.
Answer logs are hash-only by default for raw question and answer text. Tenant
settings can opt in to raw text storage, but question_hash and answer_hash
remain available either way. P3 evidence APIs add a separate evidence path for
text, audio, and video assets, sanitized segments, runtime media stream
retrieval, short-lived auditor downloads, Text Gate alpha retrieval,
tamper-evident ledger verification, and OpenTelemetry (OTel) trace-bundle
ingest.
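To make the hash-only default concrete, here is a minimal sketch of a record that keeps digests and stores raw text only on opt-in. It assumes SHA-256 over UTF-8 bytes with a sha256: prefix, mirroring the context_hash shape used in this README; the normalization and hashing that rawctx actually applies are not documented here, and text_digest and hash_only_record are hypothetical names.

```python
import hashlib


def text_digest(text: str) -> str:
    """Assumed scheme: SHA-256 over the UTF-8 bytes, prefixed with 'sha256:'."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def hash_only_record(question: str, answer: str, store_raw_text: bool = False) -> dict:
    """Build a log record that always carries hashes and optionally raw text."""
    record = {
        "question_hash": text_digest(question),
        "answer_hash": text_digest(answer),
    }
    if store_raw_text:
        # Mirrors the tenant opt-in: hashes stay available either way.
        record["question_text"] = question
        record["answer_text"] = answer
    return record
```

Because the hashes are computed the same way in both modes, a tenant that later opts in to raw text can still match new records against older hash-only ones.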
Reference audio and video evidence should be registered before the answer log,
then retrieved at answer time and cited through the returned source_ref.
Hub web follows the same reference-first model. Tenant managers register
audio/video in private workspace settings, copy the source_ref, and pass it in
source_refs when creating the answer log. The Media evidence vault is not
exposed from the Public Hub navigation or public settings routes. Web uploads
are capped at 5 MB; direct SDK/API registration can use the backend evidence
limits. Runtime media retrieval always requires a purpose, records an access
event, and returns a short-lived stream_url for the agent to read before
logging the answer. Manual auditor retrieval remains available through a
separate download API. Download filenames are emitted with a safe ASCII fallback
plus UTF-8 filename* so non-ASCII filenames work with S3 presigned downloads.
from pathlib import Path
import rawctx
media = rawctx.register_media_evidence_asset(
    filename="support-call.wav",
    mime_type="audio/wav",
    asset_type="audio",
    content=Path("support-call.wav").read_bytes(),
    metadata={"case_id": "case-123"},
    registry="https://api.rawctx.dev",
)

retrieval = rawctx.retrieve_media_evidence(
    media["evidence_asset_id"],
    purpose="answer_generation",
    external_trace_id="req_123",
    external_message_id="msg_456",
    registry="https://api.rawctx.dev",
)
# Stream retrieval["stream_url"] into the agent before it expires.

log = rawctx.log_answer(
    application_key="analytics_bot",
    environment="production",
    idempotency_key="analytics_bot:req_123:msg_456",
    external_trace_id="req_123",
    question_text="Which plan drove expansion MRR?",
    answer_text="The Team plan drove the largest expansion.",
    semantic_refs=[
        {
            "package_ref": "@acme/revenue-metrics",
            "package_version": "1.2.0",
            "context_hash": "sha256:...",
            "metrics": ["mrr"],
        }
    ],
    source_refs=[retrieval["source_ref"]],
    evidence_access_event_ids=[retrieval["access_event_id"]],
    policy_flags={"approved_definition_only": True},
    registry="https://api.rawctx.dev",
)

media_assets = rawctx.list_media_evidence_assets(asset_type="audio", registry="https://api.rawctx.dev")

download = rawctx.request_media_evidence_asset_download(
    media["evidence_asset_id"],
    purpose="auditor_media_review",
    registry="https://api.rawctx.dev",
)
# Use download["download_url"] before it expires; rawctx records the access event.

supplemental = rawctx.list_answer_evidence_assets(log["id"], registry="https://api.rawctx.dev")
segments = rawctx.list_answer_segments(log["id"], registry="https://api.rawctx.dev")

retrieved = rawctx.retrieve_text_gate_alpha(
    "expansion MRR evidence",
    application_key="analytics_bot",
    include_hash_only=True,
    registry="https://api.rawctx.dev",
)

otel_log = rawctx.ingest_otel_trace_bundle(
    application_key="analytics_bot",
    external_trace_id="req_123",
    trace_bundle={"resourceSpans": []},
    semantic_refs=[{"package_ref": "@acme/revenue-metrics", "package_version": "1.2.0"}],
    registry="https://api.rawctx.dev",
)
Use RawctxClient or AsyncRawctxClient when a service should share one
registry, token, and timeout across answer audit calls:
- create_answer_log(), also exposed as top-level log_answer()
- register_media_evidence_asset()
- list_media_evidence_assets()
- retrieve_media_evidence() for runtime agent stream access
- request_media_evidence_asset_download()
- request_answer_evidence_asset_upload() for log-scoped supplemental assets
- register_answer_evidence_asset() for log-scoped supplemental assets
- list_answer_evidence_assets() for sanitized answer detail evidence
- request_answer_evidence_asset_download() for supplemental original retrieval
- list_answer_segments()
- ingest_otel_trace_bundle()
- retrieve_text_gate_alpha()
- append_answer_log_event()
- export_answer_logs()
OpenTelemetry support is intentionally layered on top of your existing tracing: rawctx records the submitted trace bundle and binds it to approved definitions without claiming that the upstream runtime trace itself is ground truth.
Package Refs and latest
Package refs can be exact or pointer-based:
- @scope/name@1.2.3 pins one immutable published version
- @scope/name@latest asks the registry for the workspace-approved latest version
- @scope/name behaves like @scope/name@latest for download, load, and prompt workflows
When the registry returns resolution metadata, rawctx preserves the requested ref, resolved concrete version, and snapshot SHA-256 in the JSON-shaped response or prompt context. Use exact pins in CI or release automation when a job must be independent of future latest promotions.
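The ref shapes above can be parsed with a few string operations. This is a minimal sketch under stated assumptions: PackageRef, parse_package_ref, and is_pinned are hypothetical names, not rawctx APIs, and the real CLI may validate refs more strictly.

```python
from typing import NamedTuple


class PackageRef(NamedTuple):
    scope: str
    name: str
    version: str  # a concrete version, or the "latest" pointer


def parse_package_ref(ref: str) -> PackageRef:
    """Split @scope/name[@version]; a missing version is treated as latest."""
    if not ref.startswith("@") or "/" not in ref:
        raise ValueError(f"expected @scope/name[@version], got {ref!r}")
    scope, _, rest = ref[1:].partition("/")
    name, _, version = rest.partition("@")
    if not scope or not name:
        raise ValueError(f"expected @scope/name[@version], got {ref!r}")
    return PackageRef(scope, name, version or "latest")


def is_pinned(ref: str) -> bool:
    """True when the ref names an exact version rather than the latest pointer."""
    return parse_package_ref(ref).version != "latest"
```

A CI job could call is_pinned on every ref it consumes and fail early when one of them would follow a future latest promotion.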
Compare Packages
Use rawctx diff when you need semantic-level change review instead of raw file diffs.
rawctx diff ./pkg-v1 ./pkg-v2
rawctx diff semantic ./pkg-v1 ./pkg-v2 --format json
rawctx diff prompt ./pkg-v1 ./pkg-v2 --max-tokens 2000
rawctx diff eval ./pkg-v1 ./pkg-v2 --questions questions.jsonl --runs 5 --model mock
The top-level command runs both the semantic and prompt diffs. eval stays opt-in because it measures model behavior, not deterministic package structure.
Notebook / Code
Search uses the public Hub index first so CLI and SDK results match the logged-out web experience. If a search returns no public matches and you have a token configured, rawctx retries with authenticated search.
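The public-first search order can be sketched with two injected callables. This is an illustration of the described control flow only: search_with_fallback and both search callables are hypothetical stand-ins, not rawctx internals.

```python
from typing import Callable, List, Optional

SearchFn = Callable[[str], List[dict]]


def search_with_fallback(
    query: str,
    public_search: SearchFn,
    authenticated_search: SearchFn,
    token: Optional[str],
) -> List[dict]:
    """Query the public index first; retry authenticated only on an empty result."""
    results = public_search(query)
    if results or not token:
        # Public matches win, and without a token there is nothing to retry with.
        return results
    return authenticated_search(query)
```

Keeping the public call first means a logged-in user and a logged-out visitor see the same results whenever public matches exist.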
Notebook shell style:
!rawctx search "semantic model" --sort similarity --json
!rawctx info @scope/name --json
!rawctx snapshot-download @scope/name --json
!rawctx download @scope/name models/customers.yml --json
!rawctx to-prompt @scope/name --datasets customers,order_item --max-tokens 2000
!rawctx validate ./my-package --json
Python API:
import rawctx
result = rawctx.search("semantic model", registry="https://api.rawctx.dev", sort="similarity")
pkg = rawctx.info("@scope/name", registry="https://api.rawctx.dev")
model = rawctx.load("@scope/name", registry="https://api.rawctx.dev")
prompt = rawctx.to_prompt(
    "@scope/name",
    datasets=["customers", "order_item"],
    max_tokens=2000,
    registry="https://api.rawctx.dev",
)
print(model.format_name) # "osi" or "metricflow"
print(model.datasets) # normalized dataset names
print(model.measures) # [Measure(name="...", ...)]
print(model.dimensions) # [Dimension(name="...", ...)]
print(model.relationships) # [Relationship(name="...", ...)]
print(prompt)
print(pkg["model_paths"])
snapshot_dir = rawctx.snapshot_download("@scope/name", registry="https://api.rawctx.dev")
model_path = rawctx.download("@scope/name", "models/customers.yml", registry="https://api.rawctx.dev")
validation = rawctx.validate("./my-package")
semantic = rawctx.semantic_diff("./pkg-v1", "./pkg-v2")
prompt_diff = rawctx.prompt_diff("./pkg-v1", "./pkg-v2", max_tokens=2000)
combined = rawctx.diff_artifacts("./pkg-v1", "./pkg-v2")
Async Python API:
import asyncio
import rawctx
async def main():
    async with rawctx.AsyncRawctxClient(registry="https://api.rawctx.dev") as client:
        result = await client.search("semantic model", sort="similarity")
        model = await client.load("@scope/name")
        prompt = await client.to_prompt("@scope/name", datasets=["customers", "order_item"])
        snapshot_dir = await client.snapshot_download("@scope/name")
        diff_report = await client.diff("./pkg-v1", "./pkg-v2")
        return result, model, prompt, snapshot_dir, diff_report

asyncio.run(main())
to_prompt() Behavior
rawctx.to_prompt() turns a package snapshot into compact LLM context. It uses the same normalized semantic objects as load(), then applies package metadata, dataset filters, and prompt budget settings to render agent-ready text.
The same prompt compiler is available from the CLI:
rawctx to-prompt @scope/name --datasets customers,order_item --max-tokens 2000
rawctx to-prompt @scope/name --mode strict_metric --metrics mrr --render-format xml
rawctx to-prompt @scope/name --json
The rendered prompt keeps a predictable section shape:
Domain: {domain} ({package_name})
Models:
...
Datasets:
...
Metrics:
...
Relationships:
...
Dataset filters preserve the requested order, drop duplicates, and fail with UsageError: Unknown dataset(s): ... when a requested dataset is not present. Selecting a subset keeps context focused on the requested datasets and their relevant relationships.
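The filter rules (keep requested order, drop duplicates, reject unknown names) fit in a few lines. This is a minimal sketch: select_datasets is a hypothetical helper, and ValueError stands in for rawctx's UsageError, whose import path is not shown in this README.

```python
from typing import Iterable, List


def select_datasets(requested: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return requested dataset names in request order, without duplicates."""
    known = set(available)
    # dict.fromkeys drops duplicates while preserving first-seen order.
    ordered = list(dict.fromkeys(requested))
    unknown = [name for name in ordered if name not in known]
    if unknown:
        # Stand-in for UsageError; the message shape follows the text above.
        raise ValueError("Unknown dataset(s): " + ", ".join(unknown))
    return ordered


print(select_datasets(["orders", "customers", "orders"], ["customers", "orders", "items"]))
```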
max_tokens is a practical size target, not a model-specific tokenizer guarantee. When the budget is tight, rawctx prioritizes high-signal semantic context and compacts lower-priority detail. Use return_context=True when you need the selected objects, estimated size, render hash, omissions, and warnings for logging or review.
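One way to picture a size target that is not tokenizer-exact is a character-based estimate combined with priority ordering. The sketch below is an assumption-heavy illustration, not rawctx's compiler: the four-characters-per-token estimate, the fit_to_budget helper, and the choice to omit whole sections (rather than compact them) are all hypothetical.

```python
import math
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    """Rough, tokenizer-independent estimate (assumed ~4 characters per token)."""
    return math.ceil(len(text) / 4)


def fit_to_budget(sections: List[Tuple[int, str, str]], max_tokens: int) -> dict:
    """Keep sections in priority order (lower number first) until the budget is spent.

    Each section is (priority, title, body). Sections that do not fit are
    reported as omissions instead of being silently dropped.
    """
    kept, omitted, used = [], [], 0
    for _priority, title, body in sorted(sections, key=lambda item: item[0]):
        cost = estimate_tokens(body)
        if used + cost <= max_tokens:
            kept.append(title)
            used += cost
        else:
            omitted.append(title)
    return {"kept": kept, "omitted": omitted, "estimated_tokens": used}
```

Returning the omissions alongside the estimate is the same idea as return_context=True: the caller can log what was left out instead of guessing.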
Download Behavior
- download fetches one file listed in manifest.models
- snapshot-download materializes the full extracted package tree
- for native MetricFlow packages, snapshot-download is the primary handoff because it restores the full dbt-style snapshot
- load() and to_prompt() normalize both OSI and native MetricFlow packages into the same typed Python structures
- when using snapshot-download --local-dir, prefer a new or empty directory. --force only replaces an existing rawctx snapshot directory and refuses to wipe the current working directory or unrelated folders
- indexed packages remain preview-only and cannot be downloaded directly
Validate / Pack / Publish
validate, pack, and publish all start from a local package directory.
- validate: checks the manifest and validates the package according to manifest.format
- pack: builds a deterministic local .rawctx.tar.gz
- publish: validates again, rebuilds a temporary archive, calculates the checksum, uploads bytes, and completes the version
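A deterministic archive means the same inputs always produce the same bytes, which is what makes a checksum meaningful. The following is a generic sketch of that technique using only the standard library; it is not rawctx's packer, and build_deterministic_archive and archive_checksum are hypothetical names. It assumes determinism comes from sorted entries, zeroed timestamps, and fixed file modes.

```python
import gzip
import hashlib
import io
import tarfile
from typing import Dict


def build_deterministic_archive(files: Dict[str, bytes]) -> bytes:
    """Build a .tar.gz whose bytes depend only on file names and contents."""
    buffer = io.BytesIO()
    # mtime=0 and an empty filename keep the gzip header constant.
    with gzip.GzipFile(filename="", fileobj=buffer, mode="wb", mtime=0) as gz:
        with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as tar:
            for name in sorted(files):  # stable entry order
                data = files[name]
                info = tarfile.TarInfo(name=name)
                info.size = len(data)
                info.mtime = 0      # no wall-clock timestamps
                info.mode = 0o644   # fixed permissions
                tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def archive_checksum(archive: bytes) -> str:
    """SHA-256 hex digest of the archive bytes."""
    return hashlib.sha256(archive).hexdigest()
```

With this shape, packing the same directory twice yields the same checksum, so a publish step that rebuilds the archive can be compared against an earlier local pack.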
Published versions are immutable release artifacts. Private workspaces can optionally require approval before a
published version is promoted to latest. When that governance policy is enabled, publish still creates the version,
but latest moves only after the request is reviewed and approved in rawctx Hub. When governance is disabled, direct
latest promotion keeps the existing lightweight behavior.
Package directories are no longer OSI-only.
OSI package example:
my-osi-package/
  rawctx.yaml
  README.md
  models/
    sales_summary.osi.yaml
    customers.osi.yaml
Native MetricFlow package example:
my-metricflow-package/
  rawctx.yaml
  README.md
  dbt_project.yml
  models/
    customers.yml
    orders.yml
Native MetricFlow manifest example:
name: "@demo/jaffle-metrics"
version: "1.0.0"
format: "metricflow"
source_format: "metricflow"
description: "Native MetricFlow package"
models:
- models/customers.yml
- models/orders.yml
include:
- dbt_project.yml
repository: "https://github.com/dbt-labs/jaffle-sl-template"
Notes:
- format supports osi and metricflow
- models must stay relative and must resolve inside the package directory
- include is optional and is mainly useful for native packages that need extra project files such as dbt_project.yml
- standalone file validation is still limited to manifest files and OSI files, so rawctx validate models/customers.yml is not a native MetricFlow file validator by itself
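The rule that models entries must stay relative and resolve inside the package directory is a standard path-containment check. A minimal sketch follows, assuming Python 3.9+ for Path.is_relative_to; resolve_model_path is a hypothetical helper, not the validator rawctx ships.

```python
from pathlib import Path


def resolve_model_path(package_dir: Path, model_path: str) -> Path:
    """Resolve a manifest models entry and reject anything outside the package."""
    candidate = Path(model_path)
    if candidate.is_absolute():
        raise ValueError(f"model path must be relative: {model_path}")
    root = package_dir.resolve()
    resolved = (root / candidate).resolve()
    # resolve() collapses ".." segments, so traversal attempts end up outside root.
    if not resolved.is_relative_to(root):
        raise ValueError(f"model path escapes the package directory: {model_path}")
    return resolved
```

Resolving before comparing matters: a plain string prefix check would accept models/../../secrets.yml.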
Convert Workflow
Inspect-first OSI flow:
rawctx convert --from metricflow --to osi ./my-dbt-project --output ./dist/pkg
rawctx validate ./dist/pkg --json
rawctx pack ./dist/pkg --output-dir ./dist --json
Publish Directly From dbt
Convert to OSI and publish:
rawctx login
rawctx publish --from-dbt ./my-dbt-project --emit-package ./dist/pkg
Publish a native MetricFlow package:
rawctx login
rawctx publish --from-dbt ./my-dbt-project --native --emit-package ./dist/native-pkg
rawctx publish --from-dbt ./my-dbt-project --native --package-name @your-scope/jaffle-shop --package-version 1.2.3
Use --emit-package when you want the generated package directory to remain on disk after the publish run.
Latest Promotion Governance
Governance is about changing the official pointer, not editing the artifact:
published immutable version
|
request latest promotion in rawctx Hub
|
review diff or prompt preview when required
|
approval threshold reached
|
latest resolves to that concrete version
Workspace admins can enable approval before latest promotion, set the required approval count, choose whether requesters may self-approve, and require semantic diff review. The first governance surface is in the authenticated Hub UI: workspace settings configure the policy, and package version pages create, approve, reject, or cancel latest promotion requests.
CLI and Python consumers do not need a separate governance command to use the result. They keep using exact refs or the approved latest pointer:
rawctx snapshot-download @scope/name@1.2.3
rawctx to-prompt @scope/name@latest --max-tokens 1200
If current latest changes while a request is pending, rawctx marks that request stale instead of moving latest from an unexpected base version. Existing pending requests keep the approval threshold captured when the request was created.
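The stale and threshold rules can be modeled as a small state object. This is an illustrative sketch only: PromotionRequest and its fields are hypothetical, and the real policy evaluation lives in rawctx Hub rather than in the CLI or SDK.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PromotionRequest:
    target_version: str
    base_latest: str          # latest at the moment the request was created
    required_approvals: int   # threshold captured at creation time
    requester: str
    allow_self_approval: bool = False
    approvers: Set[str] = field(default_factory=set)

    def approve(self, user: str) -> None:
        if user == self.requester and not self.allow_self_approval:
            raise PermissionError("requester may not self-approve")
        self.approvers.add(user)

    def status(self, current_latest: str) -> str:
        # A moved base wins over approvals: never promote from an unexpected base.
        if current_latest != self.base_latest:
            return "stale"
        if len(self.approvers) >= self.required_approvals:
            return "approved"
        return "pending"
```

Storing required_approvals on the request, instead of reading the workspace policy at evaluation time, is what keeps a later policy change from altering requests that are already pending.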
Auth Flow (Auto + Fallback)
- Run rawctx login.
- CLI opens or prints the OAuth URL from POST /api/auth/login and falls back to the legacy GitHub endpoint if needed.
- Complete login in the browser.
- CLI automatically polls OAuth session status and captures id_token when the registry supports it.
- CLI calls POST /api/auth/token and stores the API token in ~/.rawctx/config.yaml.
Manual fallback:
rawctx login --id-token '<JWT>'
Config and Environment
Config file (default): ~/.rawctx/config.yaml
registry: "https://api.rawctx.dev"
auth:
  token: "rxctx_..."
  token_id: "uuid"
  token_name: "rawctx-cli"
  issued_at: "2026-02-28T00:00:00+00:00"
profile:
  username: "owner"
Environment overrides:
- RAWCTX_CONFIG (config path)
- RAWCTX_REGISTRY (registry URL)
- RAWCTX_TOKEN (auth token)
Priority: CLI option > env var > config > default.
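The priority chain reads naturally as a first-non-empty lookup. A minimal sketch for the registry setting follows; resolve_registry is a hypothetical helper, and treating https://api.rawctx.dev as the built-in default is an assumption based on the example config above.

```python
import os
from typing import Mapping, Optional

# Assumption: this README does not state the built-in default explicitly.
DEFAULT_REGISTRY = "https://api.rawctx.dev"


def resolve_registry(
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    config: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the registry URL: CLI option > env var > config > default."""
    env = os.environ if env is None else env
    config = config or {}
    for candidate in (cli_value, env.get("RAWCTX_REGISTRY"), config.get("registry")):
        if candidate:
            return candidate
    return DEFAULT_REGISTRY
```

Passing env and config in as mappings keeps the function easy to test without touching the real environment or ~/.rawctx/config.yaml.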
Offline Mode
--offline is supported for:
- search
- info
- download
- snapshot-download
Cache paths:
- index: ~/.rawctx/cache/packages.json
- archives: ~/.rawctx/cache/archives/@scope/name/<version>.rawctx.tar.gz
- snapshots: ~/.rawctx/packages/@scope/name/<version>/
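When checking what offline mode can see, it helps to compute the expected cache locations for a given package and version. The sketch below only assembles the paths listed above; cache_paths is a hypothetical helper, and it does not verify that rawctx created any of these files.

```python
from pathlib import Path
from typing import Dict, Optional


def cache_paths(package: str, version: str, home: Optional[Path] = None) -> Dict[str, Path]:
    """Return the index, archive, and snapshot locations for @scope/name at a version."""
    root = (home or Path.home()) / ".rawctx"
    scope, name = package.split("/", 1)  # e.g. "@acme", "revenue-metrics"
    return {
        "index": root / "cache" / "packages.json",
        "archive": root / "cache" / "archives" / scope / name / f"{version}.rawctx.tar.gz",
        "snapshot": root / "packages" / scope / name / version,
    }


for label, path in cache_paths("@acme/revenue-metrics", "1.2.0").items():
    print(f"{label}: {path} (exists: {path.exists()})")
```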