BIJOTEL

Forensic-grade tamper-evident audit chain for LLM applications.

BIJOTEL adds tamper-evidence (HMAC-SHA256 chain), content-addressable storage, and pre-call policy gating to existing OpenTelemetry GenAI pipelines (OpenLLMetry, custom instrumentations, etc.). It does NOT replace your tracer — it extends it.

Status: v1.0.0 — production-ready core (chain + CAS + policy + regression). Layers (fingerprint, AST safety, routing, misalignment probes, Combo D containment) are stable. API surface frozen for v1.x.

Install

pip install bijotel

Optional extras:

pip install bijotel[anthropic]     # Anthropic SDK + instrumentation
pip install bijotel[openai]        # OpenAI SDK
pip install bijotel[api]           # FastAPI + uvicorn (for `bijotel serve`)
pip install bijotel[fingerprint]   # sentence-transformers (semantic dedup)
pip install bijotel[ast]           # tree-sitter (bash AST safety)
pip install bijotel[all]           # everything above

Quickstart

import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

from bijotel.processors import HmacChainSpanProcessor, CasSpanProcessor

provider = TracerProvider()
provider.add_span_processor(
    HmacChainSpanProcessor(
        secret_key=bytes.fromhex(os.environ["BIJOTEL_HMAC_SECRET"]),
        db_path="chain.db",
    )
)
provider.add_span_processor(CasSpanProcessor(db_path="chain.db"))
trace.set_tracer_provider(provider)

# Now any OTel-instrumented LLM call is sealed in the chain.

Verify integrity later:

bijotel verify --db chain.db
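As a rough illustration of why an HMAC-SHA256 chain is tamper-evident, the sketch below seals each record with a MAC that also covers the previous record's MAC, so editing or dropping a record invalidates every later link. This is not BIJOTEL's implementation: `seal`, `build_chain`, `verify_chain`, the `GENESIS` value, and the use of sorted-key JSON as a stand-in for JCS canonicalization are all assumptions made for the example.

```python
import hashlib
import hmac
import json

GENESIS = b"\x00" * 32  # hypothetical starting value for the first link


def canonical(payload: dict) -> bytes:
    # Stand-in for JCS: sorted keys, no insignificant whitespace.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def seal(key: bytes, prev_mac: bytes, payload: dict) -> bytes:
    # The MAC covers the previous MAC plus the SHA-256 of this payload.
    digest = hashlib.sha256(canonical(payload)).digest()
    return hmac.new(key, prev_mac + digest, hashlib.sha256).digest()


def build_chain(key: bytes, payloads: list) -> list:
    chain, prev = [], GENESIS
    for payload in payloads:
        mac = seal(key, prev, payload)
        chain.append({"payload": payload, "mac": mac.hex()})
        prev = mac
    return chain


def verify_chain(key: bytes, chain: list) -> bool:
    prev = GENESIS
    for entry in chain:
        expected = seal(key, prev, entry["payload"])
        # Constant-time comparison of the recomputed and stored MACs.
        if not hmac.compare_digest(expected.hex(), entry["mac"]):
            return False
        prev = expected
    return True
```

Verification needs only the key and the stored records, which is why a verifier can run long after the spans were written.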

Features (13/20 bijuterii catalog patterns covered)

#1 Permitted/Safe/Sealed — three-question safety frame (Combo D)
#2 Content-Addressable Storage + Merkle DAG — dedup + reference graph
#5 AST-First Code Safety — tree-sitter bash + stdlib Python AST scan
#7 Deterministic + Semantic Fingerprinting — SHA-256 + embeddings
#10 Compliance-as-Code — PII / output-length / model-pin / cost rules
#11 Forensic-First (HMAC chain) — JCS + SHA-256 + HMAC tamper-evidence
#15 Inference Routing — Pareto cost/quality/latency selector + budget
#16 Regression Detection — z-score + IQR drift detection on tokens/cost
#18 Misalignment Probes — 29 builtin probes across 8 attack categories
Plus: provider adapters (Anthropic, OpenAI), @trace_genai decorator, portable signed JSON chain export.
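The content-addressable storage idea behind #2 can be sketched in a few lines: a blob is stored under the SHA-256 of its bytes, so identical prompts or responses collapse into one entry and only a reference count grows. `TinyCas` and its methods are hypothetical names for illustration and say nothing about BIJOTEL's actual schema or its Merkle DAG layer.

```python
import hashlib


class TinyCas:
    """In-memory content-addressable store with reference counting."""

    def __init__(self) -> None:
        self._blobs: dict = {}
        self._refs: dict = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself.
        key = hashlib.sha256(data).hexdigest()
        if key not in self._blobs:
            self._blobs[key] = data
        self._refs[key] = self._refs.get(key, 0) + 1
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def ref_count(self, key: str) -> int:
        return self._refs.get(key, 0)
```

Storing the same bytes twice returns the same key and raises the reference count to 2, while the blob itself is kept once.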

Docker

docker run -p 8080:8080 \
    -v $(pwd)/data:/data \
    -e BIJOTEL_HMAC_SECRET=$(openssl rand -hex 32) \
    bijotel/bijotel:1.0.0

See docker-compose.yml in the repo for the full reference deploy.

Architecture

BIJOTEL is a plug-in. You keep your existing OpenTelemetry tracer (e.g., opentelemetry-instrumentation-anthropic). BIJOTEL adds three reusable SpanProcessors:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.instrumentation.anthropic import AnthropicInstrumentor

from bijotel.processors import (
    HmacChainSpanProcessor,    # F2: tamper-evident audit chain
    CasSpanProcessor,          # F3: content-addressable storage
    PolicyGateSpanProcessor,   # F4: in-process policy gate
)

provider = TracerProvider()
provider.add_span_processor(HmacChainSpanProcessor(secret_key="..."))
provider.add_span_processor(CasSpanProcessor(store_path="./cas.db"))
provider.add_span_processor(PolicyGateSpanProcessor(rules=[...]))
trace.set_tracer_provider(provider)

AnthropicInstrumentor().instrument()  # the tracer stays upstream

Custom Code Tracing (`@trace_genai`)

For LLM calls outside instrumentation-anthropic coverage (custom wrappers, non-Anthropic providers, multi-provider clients), use the @trace_genai decorator or its runtime equivalent, bijotel.wrap():

from bijotel import trace_genai

# Anthropic-style API: defaults work
@trace_genai(provider="anthropic")
def call_claude(*, model, messages, max_tokens):
    return client.messages.create(model=model, messages=messages, max_tokens=max_tokens)

# Custom API: provide extractors (e.g. for multi-provider wrappers)
@trace_genai(
    name="ara.llm.call",
    provider="ara",
    request_extractor=lambda kw: {
        "model": kw["cfg"].model_id,
        "messages": kw["messages"],
        "max_tokens": kw["cfg"].max_tokens,
    },
    response_extractor=lambda resp: {
        "input_tokens": resp.input_tokens,
        "output_tokens": resp.output_tokens,
    },
    extra_attrs={"ara.deployment": "prod"},  # constants only
)
async def complete(self, *, agent_id, messages, cfg, **kwargs):  # other parameters elided
    return await self._dispatch(...)

Sync versus async is auto-detected via asyncio.iscoroutinefunction. All emitted spans pass through the HmacChain/CAS/Policy processors as usual. Exceptions in the wrapped function set the span status to ERROR and are re-raised. Extractor failures are recorded in the bijotel.extractor_error span attribute and never crash the call.

bijotel.wrap(fn, ...) is the runtime alternative — same behavior, no source modification needed (third-party libs, dynamic dispatch).
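The decorator behavior described above (sync/async detection, ERROR status plus re-raise, extractor failures that never break the call) can be sketched without OpenTelemetry. Everything here is hypothetical: `traced`, the `RECORDS` list standing in for emitted spans, and the record keys are illustration only. The sketch uses `inspect.iscoroutinefunction`, a close standard-library relative of the asyncio helper named above.

```python
import functools
import inspect

RECORDS: list = []  # stand-in for emitted spans


def traced(name: str, request_extractor=None):
    """Hypothetical decorator mirroring the behavior described above."""

    def decorate(fn):
        def start(kwargs: dict) -> dict:
            record = {"name": name, "status": "OK"}
            if request_extractor is not None:
                try:
                    record.update(request_extractor(kwargs))
                except Exception as exc:
                    # Extractor bugs are recorded, never propagated.
                    record["extractor_error"] = repr(exc)
            RECORDS.append(record)
            return record

        if inspect.iscoroutinefunction(fn):
            @functools.wraps(fn)
            async def async_wrapper(*args, **kwargs):
                record = start(kwargs)
                try:
                    return await fn(*args, **kwargs)
                except Exception:
                    record["status"] = "ERROR"
                    raise
            return async_wrapper

        @functools.wraps(fn)
        def sync_wrapper(*args, **kwargs):
            record = start(kwargs)
            try:
                return fn(*args, **kwargs)
            except Exception:
                record["status"] = "ERROR"
                raise
        return sync_wrapper

    return decorate


@traced("demo.sync", request_extractor=lambda kw: {"model": kw["model"]})
def call_sync(*, model):
    return f"ok:{model}"


@traced("demo.async", request_extractor=lambda kw: {"model": kw["missing"]})
async def call_async(*, model):
    raise RuntimeError("boom")
```

The second function deliberately combines a broken extractor with a failing body: the record gains an `extractor_error` entry, its status becomes ERROR, and the original exception still reaches the caller.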

Note: dual audit when combining `@trace_genai` with `AnthropicInstrumentor`

If you decorate a function that internally calls client.messages.create() while AnthropicInstrumentor().instrument() is active, two spans are emitted per call:

Outer span: from @trace_genai (your wrapper boundary)
Inner span: from AnthropicInstrumentor (the SDK call itself)

Both are sealed in the chain. This is intentional — the outer span captures your application context (e.g. ara.agent_id, ara.org_id), the inner span captures the raw SDK request/response. Together they give you full audit coverage at two granularities.

If you want only one audit layer, choose one approach:

Decorator only (single span per logical call): don't call AnthropicInstrumentor().instrument()
Instrumentation only (single span per SDK call): don't decorate your wrapper

Storage cost of dual audit: ~2× span count. For most workloads this is trivial; for high-volume production, pick one layer.

Provider Adapters (F7)

The Provider protocol gives every LLM provider the same integration surface. Adapters implement its contract methods, so @trace_genai can pull everything it needs from a single provider=adapter argument:

from bijotel import trace_genai
from bijotel.adapters import AnthropicAdapter

adapter = AnthropicAdapter()

@trace_genai(provider=adapter)
async def my_call(*, model, messages, max_tokens):
    return await adapter.complete(
        messages=messages, model=model, max_tokens=max_tokens
    )

The decorator auto-extracts:

gen_ai.provider.name from adapter.name
Request attrs from adapter.extract_request_attrs()
Response attrs from adapter.extract_response_attrs()

Explicit request_extractor= / response_extractor= always override adapter-supplied methods (escape hatch preserved).

Calling the adapter directly returns a normalized ProviderResponse:

response = await adapter.complete(
    messages=[{"role": "user", "content": "hi"}],
    model="claude-haiku-4-5-20251001",
    max_tokens=20,
)
print(response.text, response.input_tokens, response.output_tokens)

Available adapters:

AnthropicAdapter — Anthropic Claude (uses anthropic.AsyncAnthropic). Install: pip install bijotel[anthropic].
OpenAIAdapter — OpenAI GPT (uses openai.AsyncOpenAI). Install: pip install bijotel[openai].

from bijotel import trace_genai
from bijotel.adapters import OpenAIAdapter

adapter = OpenAIAdapter()

@trace_genai(provider=adapter)
async def call_gpt(*, model, messages, max_tokens):
    return await adapter.complete(
        messages=messages, model=model, max_tokens=max_tokens
    )

# Direct call:
response = await adapter.complete(
    messages=[{"role": "user", "content": "hi"}],
    model="gpt-4o-mini",
    max_tokens=20,
)

Same Provider protocol, same ProviderResponse shape; only the SDK underneath differs. F7 was validated empirically against two consumers (Anthropic + OpenAI).

Adding new providers — subclass Provider:

from bijotel.adapters import Provider, ProviderResponse

class OpenAIAdapter(Provider):
    @property
    def name(self) -> str:
        return "openai"

    def extract_request_attrs(self, kwargs): ...
    def extract_response_attrs(self, response): ...

    async def complete(self, *, messages, model, max_tokens, **kwargs):
        raw = await self.client.chat.completions.create(...)
        return ProviderResponse(
            text=raw.choices[0].message.content,
            model=raw.model,
            input_tokens=raw.usage.prompt_tokens,
            output_tokens=raw.usage.completion_tokens,
            response_id=raw.id,
            finish_reason=raw.choices[0].finish_reason,
            raw_response=raw,
        )

Backward-compatible: passing provider="anthropic" (string) still works exactly as in F5 — Provider object is opt-in.

Policy Gate

The PolicyEngine evaluates pre-call rules against the request payload (model, messages, max_tokens, …) and returns a Decision (allow / warn / deny). Use the guard decorator for the typical "wrap an LLM call" pattern, or call PolicyEngine directly for custom integration.

`PolicyEngine` direct usage

from bijotel import PolicyEngine, cost_per_call_max, model_allowlist

engine = PolicyEngine(rules=[
    cost_per_call_max(usd=0.50),
    model_allowlist("claude-haiku-4-5", "claude-sonnet-4-20250514"),
])

request = {"model": "claude-haiku-4-5", "messages": [...], "max_tokens": 100}
decision = engine.evaluate(request)

if decision.is_deny:
    print(f"Blocked by {decision.rule}: {decision.reason}")
elif decision.is_warn:
    print(f"Warning from {decision.rule}: {decision.reason}")  # call still proceeds
else:
    print("Allowed")

engine.evaluate() short-circuits on first deny. Warnings are collected and attached as bijotel.policy.warning attributes on emitted spans. See Decision and State classes in bijotel.policy.decision.
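The evaluation order described above (stop at the first deny, otherwise collect warnings) can be sketched as follows. This is a minimal model under stated assumptions, not the library's engine: `Outcome`, `allowlist`, `max_tokens_cap`, and `evaluate` are hypothetical names, and the real Decision class may carry different fields.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Outcome:
    state: str  # "allow", "warn", or "deny"
    rule: Optional[str] = None
    reason: Optional[str] = None
    warnings: list = field(default_factory=list)


Rule = Callable[[dict], Outcome]


def allowlist(*models: str, mode: str = "deny") -> Rule:
    def check(request: dict) -> Outcome:
        model = request.get("model")
        if model in models:
            return Outcome("allow")
        return Outcome(mode, rule="allowlist", reason=f"model {model!r} not allowed")
    return check


def max_tokens_cap(limit: int) -> Rule:
    def check(request: dict) -> Outcome:
        if request.get("max_tokens", 0) <= limit:
            return Outcome("allow")
        return Outcome("deny", rule="max_tokens_cap", reason=f"max_tokens > {limit}")
    return check


def evaluate(rules: list, request: dict) -> Outcome:
    warnings = []
    for rule in rules:
        outcome = rule(request)
        if outcome.state == "deny":
            outcome.warnings = warnings  # keep what was collected so far
            return outcome  # short-circuit: later rules never run
        if outcome.state == "warn":
            warnings.append((outcome.rule, outcome.reason))
    return Outcome("warn" if warnings else "allow", warnings=warnings)
```

Putting cheap, high-signal rules first keeps the common deny path fast, because nothing after the denying rule is evaluated.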

`model_allowlist`

Restrict which models can be called via your wrapper. Useful for cost control + audit.

from bijotel import model_allowlist

# Deny if model not in list
rule = model_allowlist("claude-haiku-4-5", "claude-sonnet-4-20250514", mode="deny")

# Warn-only mode (audit + proceed)
rule_audit = model_allowlist("claude-haiku-4-5", mode="warn")

`prompt_pattern_deny` (F11)

Block prompts matching jailbreak / prompt-injection regex patterns before the SDK call is made. Five attack categories covered out of the box: instruction override ("ignore previous instructions"), system prompt extraction ("reveal your system prompt"), role override ("you are now a different AI"), jailbreak framing ("DAN mode", "developer mode"), encoding bypass (base64:, rot13). Defaults are case-insensitive and applied via re.search.

from bijotel import prompt_pattern_deny

# Defaults only (DEFAULT_JAILBREAK_PATTERNS, ~15 patterns, 5 categories)
rule = prompt_pattern_deny()

# Custom patterns appended to defaults (defaults checked first)
rule = prompt_pattern_deny(
    patterns=[r"my_company_secret", r"\bAPI[_-]KEY\b"],
)

# Custom patterns only — defaults disabled
rule = prompt_pattern_deny(
    patterns=[r"sensitive_term"], use_defaults=False
)

# Warn mode — audit but allow (recommended for first deployment)
rule_audit = prompt_pattern_deny(mode="warn")

Handles Anthropic SDK string content (messages=[{"role": "user", "content": "..."}]), Anthropic multipart content (content=[{"type": "text", "text": "..."}]), and OpenAI-style messages. Text content from all roles is extracted and concatenated before matching.
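To make the extract-then-match step concrete, here is a small sketch that flattens string and multipart message content into one text and runs case-insensitive `re.search` over it. The two patterns are illustrative only and are not the library's default catalog; `collect_text` and `first_match` are hypothetical helper names.

```python
import re

# Two illustrative patterns only; the real default catalog is larger.
PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"reveal\s+your\s+system\s+prompt", re.IGNORECASE),
]


def collect_text(messages: list) -> str:
    """Concatenate text from string content and multipart content blocks."""
    parts = []
    for message in messages:
        content = message.get("content") or ""
        if isinstance(content, str):
            parts.append(content)
            continue
        for block in content:
            # Non-text blocks (images, tool results) are skipped.
            if isinstance(block, dict) and block.get("type") == "text":
                parts.append(block.get("text", ""))
    return "\n".join(parts)


def first_match(messages: list):
    """Return the source of the first matching pattern, or None."""
    text = collect_text(messages)
    for pattern in PATTERNS:
        if pattern.search(text):
            return pattern.pattern
    return None
```

Returning the matched pattern rather than a bare boolean makes warn-mode logs easier to triage for false positives.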

Suggested rollout: deploy in mode="warn" first to surface false positives via bijotel.policy.warning span attributes, review for ~1 week, then flip to mode="deny". False positives are easier to diagnose than false negatives in this domain.

Pattern catalog adapted from substrate-guard's agent_safety.rego dangerous_patterns concept (separate project, read-only access). The substrate-guard version targets filesystem / network / shell actions; this BIJOTEL adaptation targets LLM prompts (instruction overrides, system-prompt extraction, role overrides, jailbreak framings, encoding bypass).

`PolicyDeniedError`

Raised by guard() decorator when a rule returns Decision.deny. Catch it in your application code to surface a useful message:

from bijotel import guard, PolicyDeniedError, cost_per_call_max

@guard(rules=[cost_per_call_max(usd=0.10)])
def call_llm(*, model, messages, max_tokens):
    return client.messages.create(model=model, messages=messages, max_tokens=max_tokens)

try:
    response = call_llm(model="claude-opus-4-7", messages=[...], max_tokens=4000)
except PolicyDeniedError as e:
    print(f"Policy denied: rule={e.rule!r}, reason={e.reason!r}")
    # Surface the denial to the user; the expensive call was never made.

Chain export — programmatic API

The CLI is the typical entry point, but export_chain and verify_export are also exposed as public functions for programmatic integration (e.g. scheduled audit-trail uploads, CI verification jobs):

from pathlib import Path
from bijotel import export_chain, verify_export

secret = bytes.fromhex("<your hex secret>")  # min 16 bytes

# Export
out = export_chain(
    db_path=Path("/data/bijotel_chain.db"),
    output_path=Path("/var/audit/audit_2026-05-10.json"),
    secret_key=secret,
)
# → "/var/audit/audit_2026-05-10.json"

# Verify (auditor side, only needs secret + JSON file)
valid, reason = verify_export(out, secret)
if not valid:
    raise RuntimeError(f"Audit trail tampered: {reason}")

Schema: bijotel-chain-v1. Per-entry HMAC + file-level chain_signature. Integrity verifiable with shared secret only — no SQLite access required.
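One way to picture the two levels of protection (per-entry HMAC plus a file-level signature) is the sketch below. Only the schema name and the `chain_signature` field come from the text above; the entry layout, the `seq`, `payload`, and `hmac` keys, and the choice to sign the concatenated entry MACs are assumptions for illustration, not the actual bijotel-chain-v1 format.

```python
import hashlib
import hmac
import json


def _mac(key: bytes, data: bytes) -> str:
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def _body(entry: dict) -> bytes:
    # Everything except the entry's own MAC, serialized deterministically.
    unsigned = {k: v for k, v in entry.items() if k != "hmac"}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode()


def sign_export(key: bytes, payloads: list) -> dict:
    entries = []
    for seq, payload in enumerate(payloads, start=1):
        entry = {"seq": seq, "payload": payload}
        entry["hmac"] = _mac(key, _body(entry))
        entries.append(entry)
    joined = "".join(e["hmac"] for e in entries).encode()
    return {
        "schema": "bijotel-chain-v1",
        "entries": entries,
        "chain_signature": _mac(key, joined),
    }


def check_export(key: bytes, doc: dict):
    """Return (valid, reason) for an export document."""
    for entry in doc["entries"]:
        if not hmac.compare_digest(_mac(key, _body(entry)), entry["hmac"]):
            return False, f"entry {entry['seq']} failed its HMAC check"
    joined = "".join(e["hmac"] for e in doc["entries"]).encode()
    if not hmac.compare_digest(_mac(key, joined), doc["chain_signature"]):
        return False, "file-level signature mismatch"
    return True, "ok"
```

The per-entry check catches edits to a single record, while the file-level signature catches entries that were removed or reordered without being modified.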

Regression Detection (F12, Bijuteria #16)

Detect drift in token usage / cost over time using z-score + IQR methods on the BIJOTEL chain.db. Empirically motivated by patterns observed during GENA deployment (T+2h checkpoint revealed bimodal quality distributions and dimension-specific bottlenecks worth monitoring temporally).

Programmatic API

from bijotel import RegressionDetector, AnomalyMethod

detector = RegressionDetector(
    db_path="chain.db",
    baseline_window=100,        # Use last 100 spans as baseline
    z_threshold=3.0,            # Flag values > 3σ from mean
    iqr_multiplier=1.5,         # Tukey-style IQR outlier
    method=AnomalyMethod.BOTH,  # Require BOTH methods to flag (low FP)
)

# Single dimension
anomalies = detector.detect("input_tokens")
for a in anomalies:
    print(f"  seq={a.seq} value={a.value} z={a.z_score:.2f} severity={a.severity}")

# All 3 dimensions (input_tokens, output_tokens, cost)
results = detector.detect_all_dimensions(filter_model="claude-haiku-4-5-20251001")

CLI usage

# Scan all 3 dimensions on entire chain (default: last 50 spans vs prior 100)
bijotel regression --db chain.db

# Single dimension, specific model
bijotel regression --db chain.db --dimension cost --model claude-sonnet-4-20250514

# Custom baseline window + sensitivity
bijotel regression --db chain.db --window 200 --z-threshold 2.5

Exit codes: 0 no anomalies, 1 anomalies detected, 2 invalid args.

Detection methods

z-score (parametric): z = (value - baseline.mean) / baseline.stdev. Fast for Gaussian-like signals (most token counts when calls are similar).
IQR (non-parametric, Tukey): flag if value < p25 - k·iqr OR value > p75 + k·iqr. Robust to heavy-tailed distributions (cost can spike).
AnomalyMethod.BOTH (default): flags only when BOTH agree → minimizes false positives. Use Z_SCORE or IQR alone for broader detection.
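The two tests above are easy to reproduce with the standard library. The sketch below is a hand-rolled approximation, not the RegressionDetector code: `z_score_flag`, `iqr_flag`, and `classify` are hypothetical names, and the quartiles come from `statistics.quantiles` with its default method, which may differ from the percentile rule the library uses.

```python
import statistics


def z_score_flag(baseline: list, value: float, threshold: float = 3.0) -> bool:
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        return value != mean  # degenerate baseline: any change is a deviation
    return abs(value - mean) / stdev > threshold


def iqr_flag(baseline: list, value: float, k: float = 1.5) -> bool:
    # quantiles(n=4) returns the three quartile cut points: p25, p50, p75.
    p25, _, p75 = statistics.quantiles(baseline, n=4)
    iqr = p75 - p25
    return value < p25 - k * iqr or value > p75 + k * iqr


def classify(baseline: list, value: float) -> str:
    # Count how many of the two methods fired (bools add up as 0, 1, or 2).
    hits = z_score_flag(baseline, value) + iqr_flag(baseline, value)
    return {0: "normal", 1: "warning", 2: "anomaly"}[hits]
```

With the baseline `[100, 102, 98, 101, 99, 100, 103, 97]` (mean 100, sample standard deviation 2), a value of 200 trips both tests, while 106.5 exceeds 3σ but stays inside the upper Tukey fence of 107, so only one method fires.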

Severity levels

anomaly — both z-score AND IQR triggered (high confidence drift).
warning — only one method triggered (worth review, lower confidence).

Limitations

Requires ≥5 baseline samples (MIN_SAMPLES). With fewer, detection returns an empty list, so an empty result can mean "not enough data" rather than "no drift".
Cost dimension requires model in DEFAULT_PRICES price table (see policy/prices.py); spans with unknown models contribute no cost datapoint.
Single chain.db per RegressionDetector instance; cross-chain analysis is not supported in this release.

Shutting down BIJOTEL

shutdown() flushes any pending spans and tears down the global TracerProvider. This matters for scripts that exit immediately: without a flush, the last spans may be lost.

from bijotel import init, shutdown

init(...)
# ... do work, emit spans ...
shutdown()  # flushes processors, releases resources

shutdown() is idempotent — safe to call multiple times.
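For readers wondering what idempotent shutdown looks like in practice, here is a generic sketch of the pattern: a guarded flag makes repeated calls harmless, and an `atexit` hook covers scripts that forget to call it. `Lifecycle` and `flush_count` are hypothetical stand-ins and do not describe BIJOTEL's internals.

```python
import atexit
import threading


class Lifecycle:
    """Hypothetical holder for flush and teardown work."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._closed = False
        self.flush_count = 0

    def shutdown(self) -> bool:
        """Flush once; later calls are no-ops. Returns True if work was done."""
        with self._lock:
            if self._closed:
                return False
            self._closed = True
        self.flush_count += 1  # stand-in for flushing pending spans
        return True


lifecycle = Lifecycle()
# Registering at interpreter exit protects short scripts that forget to call it.
atexit.register(lifecycle.shutdown)
```

Because the second and later calls return immediately, calling shutdown explicitly and also leaving the exit hook registered is safe.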

Development install

git clone <repo>
cd BIJOTEL
pip install -e ".[anthropic,api,fingerprint,ast,dev]"
pytest

CLI

After install, the bijotel command is available:

# Verify chain integrity (requires HMAC secret)
export BIJOTEL_HMAC_SECRET=<hex>
bijotel verify --db chain.db

# Inspect a span (by hex span_id or integer seq)
bijotel inspect --db chain.db 1
bijotel inspect --db chain.db abc123def456

# Summary stats (chain + CAS + policy daily state)
bijotel stats --db chain.db

# List spans with filters
bijotel list --db chain.db
bijotel list --db chain.db --blocked
bijotel list --db chain.db --rule cost_per_call_max
bijotel list --db chain.db --model claude-haiku-4-5-20251001
bijotel list --db chain.db --since 2026-05-07 --limit 100

# Export chain to portable signed JSON (verifiable by external auditors)
bijotel export --db chain.db --output audit_trail.json

# Verify integrity of an exported JSON (no DB needed, just secret)
bijotel verify-export audit_trail.json

# Run the HTTP API server (requires `pip install bijotel[api]`)
bijotel serve --port 8080 --db chain.db
# GET /health, /version, /docs (OpenAPI / Swagger UI)

--since uses calendar date UTC (YYYY-MM-DD, lower bound 00:00:00Z), consistent with daily_token_budget rule.
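The date semantics above amount to treating YYYY-MM-DD as midnight UTC and using it as an inclusive lower bound. A short sketch, with the hypothetical helpers `since_lower_bound` and `is_included` and the assumption that span timestamps are timezone-aware:

```python
from datetime import datetime, timezone


def since_lower_bound(day: str) -> datetime:
    """Parse YYYY-MM-DD as 00:00:00 UTC on that calendar date."""
    parsed = datetime.strptime(day, "%Y-%m-%d")
    return parsed.replace(tzinfo=timezone.utc)


def is_included(span_time: datetime, since: str) -> bool:
    # span_time is assumed to be timezone-aware; the bound is inclusive.
    return span_time >= since_lower_bound(since)
```

A span stamped 2026-05-06T23:59:59Z is therefore excluded by `--since 2026-05-07`, even if it falls on May 7 in the operator's local time zone.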

Validation

End-to-end smoke test on real Anthropic API exercising the full BIJOTEL stack (HmacChain + CAS + PolicyGate + AnthropicInstrumentor + @trace_genai decorator + all 6 CLI commands):

export ANTHROPIC_API_KEY=sk-ant-...
export BIJOTEL_HMAC_SECRET=$(python -c "import secrets; print(secrets.token_hex(32))")
python scripts/e2e_smoke.py

Cost: ~$0.001 per run (3-4 real Haiku calls; denied calls don't hit network).

The script validates:

Chain integrity end-to-end (bijotel verify returns VALID)
CAS dedup on identical input (ref_count > 1 for repeated calls)
Policy gate enforcement (denied calls produce synthetic spans, no SDK call)
All 6 CLI subcommands return exit 0
Custom @trace_genai decorator works alongside AnthropicInstrumentor

Roadmap

Shipped in v1.0.0:

F0–F6: Core (skeleton → init → HMAC chain → CAS → policy gate → decorator → CLI)
F7: Provider protocol + AnthropicAdapter + OpenAIAdapter
F8: Portable signed JSON chain export
F11: prompt_pattern_deny (regex jailbreak/injection detection)
F12: Regression detection (z-score + IQR over tokens/cost)
F13: Deterministic + semantic fingerprinting layer
F14: AST safety layer (tree-sitter bash + stdlib Python ast)
F15: Inference routing (Pareto cost/quality/latency + budget)
F16: CAS Merkle DAG (content-addressable + reference graph)
F17: Misalignment probe library (29 probes × 8 attack categories)
F18: Combo D containment guard (Policy + AST + chain seal)
Compliance rules: PII / output-length / model-pin
CLI: verify + inspect + stats + list + export + verify-export + regression + serve
Hardening: WAL + busy_timeout + BEGIN IMMEDIATE, crash isolation, perms, lockfile
FastAPI bijotel serve (health + version, full chain/policy/regression in v1.1.0)
Docker image + docker-compose example

Planned:

v1.1.0 — FastAPI chain/policy/regression endpoints
v1.2.0 — Dashboard (chain explorer + policy + regression)
v1.3.0 — Consensus voting (Bijuteria #9) + energy accounting (#3)

License

MIT
