The foundational shared event schema for the LLM Developer Toolkit
llm-toolkit-schema
The shared language every LLM tool speaks.
A lightweight Python library that gives your AI applications a common, structured way to record, sign, redact, and export events — with zero mandatory dependencies.
What is this?
Think of llm-toolkit-schema as a universal receipt format for your AI application. Every time your app calls a language model, makes a decision, redacts private data, or checks a guardrail, this library gives that action a consistent, structured record that any tool in your stack can read.
Without a shared schema, every team invents their own log format. With llm-toolkit-schema, your logs, dashboards, compliance reports, and monitoring tools all speak the same language — automatically.
Why use it?
| Without llm-toolkit-schema | With llm-toolkit-schema |
|---|---|
| Each service logs events differently | Every event follows the same structure |
| Hard to audit who saw what data | Built-in HMAC signing creates a tamper-evident audit trail |
| PII scattered across logs | First-class PII redaction before data leaves your app |
| Vendor-specific observability | OpenTelemetry-compatible — works with any monitoring stack |
| No way to check compatibility | CLI + programmatic compliance checks in CI |
| Complex integration glue | Zero required dependencies — just pip install |
Install
pip install llm-toolkit-schema
import llm_toolkit_schema # that's it — no configuration needed
Requires Python 3.9 or later. No other packages are required for core usage.
Optional extras
pip install "llm-toolkit-schema[jsonschema]" # strict JSON Schema validation
pip install "llm-toolkit-schema[http]" # Webhook + OTLP export
pip install "llm-toolkit-schema[pydantic]" # Pydantic v2 model layer
pip install "llm-toolkit-schema[otel]" # OpenTelemetry SDK integration
pip install "llm-toolkit-schema[kafka]" # EventStream.from_kafka() via kafka-python
pip install "llm-toolkit-schema[langchain]" # LangChain callback handler
pip install "llm-toolkit-schema[llamaindex]" # LlamaIndex event handler
pip install "llm-toolkit-schema[datadog]" # Datadog APM + metrics exporter
pip install "llm-toolkit-schema[all]" # everything above
Five-minute tour
1 — Record an event
from llm_toolkit_schema import Event, EventType, Tags
event = Event(
    event_type=EventType.TRACE_SPAN_COMPLETED,
    source="my-app@1.0.0",  # who emitted this
    org_id="org_acme",      # your organisation
    payload={
        "model": "gpt-4o",
        "prompt_tokens": 512,
        "completion_tokens": 128,
        "latency_ms": 340.5,
    },
    tags=Tags(env="production"),
)
event.validate() # raises if structure is invalid
print(event.to_json()) # compact JSON string, ready to store or ship
Every event gets a ULID (a time-sortable unique ID) automatically — no need to generate one yourself.
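A ULID packs a 48-bit millisecond timestamp and 80 random bits into 26 Crockford base32 characters, which is why the IDs sort by creation time. The sketch below shows that layout using only the standard library. It is an illustration of the format, not the generator this library ships, and `new_ulid` is a hypothetical name.

```python
import os
import time

# Crockford base32 alphabet used by ULIDs (no I, L, O or U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms=None):
    """Return a 26-character ULID: 48-bit timestamp followed by 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = (timestamp_ms << 80) | randomness           # 128-bit integer
    chars = []
    for _ in range(26):                                  # 26 * 5 bits covers 128 bits
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


earlier = new_ulid(1_700_000_000_000)
later = new_ulid(1_700_000_000_001)
```

Because the alphabet is in ascending ASCII order and the timestamp occupies the leading characters, plain string comparison orders `earlier` before `later`.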
2 — Redact private information before logging
from llm_toolkit_schema.redact import Redactable, RedactionPolicy, Sensitivity
policy = RedactionPolicy(min_sensitivity=Sensitivity.PII, redacted_by="policy:gdpr-v1")
# Wrap any string that might contain PII
prompt = Redactable("Call me at 555-867-5309", sensitivity=Sensitivity.PII)
result = policy.apply({"prompt": prompt})
# result["prompt"] → "[REDACTED by policy:gdpr-v1]"
Redactable is a string wrapper. You mark fields as sensitive at the point where they're created; the policy decides what to remove before the event is written to any log.
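To make the "mark at creation, decide at write time" idea concrete, here is a minimal standard-library sketch of a string wrapper that carries a sensitivity level and a policy function that replaces anything at or above a threshold. `Level`, `Marked`, and `apply_policy` are hypothetical stand-ins, and the ordering of the levels is an assumption; the library's own `Redactable` and `RedactionPolicy` may work differently.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical sensitivity levels; higher means more sensitive."""
    LOW = 1
    PII = 2
    PHI = 3


class Marked(str):
    """A str subclass that remembers how sensitive its content is."""

    def __new__(cls, value, level):
        obj = super().__new__(cls, value)
        obj.level = level
        return obj


def apply_policy(record, min_level, redacted_by):
    """Return a copy of record with sufficiently sensitive values replaced."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, Marked) and value.level >= min_level:
            cleaned[key] = f"[REDACTED by {redacted_by}]"
        else:
            cleaned[key] = value
    return cleaned


record = {
    "prompt": Marked("Call me at 555-867-5309", Level.PII),
    "model": "gpt-4o",
}
result = apply_policy(record, Level.PII, "policy:gdpr-v1")
```

The wrapper behaves like an ordinary string everywhere else in the application, so marking a field does not change how it is used until the policy runs.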
3 — Sign events for tamper-proof audit trails
from llm_toolkit_schema.signing import sign_event, verify_chain, AuditStream
# Sign a single event
signed = sign_event(event, org_secret="my-org-secret")
# Or build a chain — every event references the one before it,
# so any gap or modification is immediately detectable.
stream = AuditStream(org_secret="my-org-secret")
for e in events:  # events: an iterable of Event objects you have already created
    stream.append(e)
is_valid, violations = verify_chain(stream.events, org_secret="my-org-secret")
This is the same principle used in certificate chains and blockchain — each event's signature covers the previous event's signature, so you cannot alter history without breaking the chain.
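The chaining idea can be sketched with the standard library alone: each signature is an HMAC-SHA256 over the entry's payload plus the previous entry's signature. The function names and the entry layout below are assumptions made for illustration; they do not describe the library's actual signature format.

```python
import hashlib
import hmac
import json


def sign(payload, prev_signature, secret):
    """HMAC-SHA256 over the canonical payload plus the previous signature."""
    message = json.dumps(payload, sort_keys=True).encode() + prev_signature.encode()
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


def build_chain(payloads, secret):
    """Sign each payload so that it also covers the signature before it."""
    chain, prev = [], ""
    for payload in payloads:
        prev = sign(payload, prev, secret)
        chain.append({"payload": payload, "signature": prev})
    return chain


def verify(chain, secret):
    """Return the indexes of entries whose signature does not match."""
    violations, prev = [], ""
    for index, entry in enumerate(chain):
        expected = sign(entry["payload"], prev, secret)
        if not hmac.compare_digest(expected, entry["signature"]):
            violations.append(index)
        prev = entry["signature"]
    return violations


chain = build_chain([{"n": 1}, {"n": 2}, {"n": 3}], "my-org-secret")
clean = verify(chain, "my-org-secret")

chain[1]["payload"]["n"] = 99          # alter history
tampered = verify(chain, "my-org-secret")
```

Editing an entry invalidates its own signature, and deleting an entry invalidates the one that followed it, so both kinds of change show up during verification.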
4 — Export to anywhere
from llm_toolkit_schema.stream import EventStream
from llm_toolkit_schema.export.jsonl import JSONLExporter
from llm_toolkit_schema.export.webhook import WebhookExporter
from llm_toolkit_schema.export.otlp import OTLPExporter
from llm_toolkit_schema.export.datadog import DatadogExporter
from llm_toolkit_schema.export.grafana import GrafanaLokiExporter
# The await calls below must run inside an async function (or an asyncio REPL).
stream = EventStream(events)
# Write everything to a local file
await stream.drain(JSONLExporter("events.jsonl"))
# Ship to your OpenTelemetry collector
await stream.drain(OTLPExporter("http://otel-collector:4318/v1/traces"))
# Send to Datadog APM (traces + metrics)
await stream.drain(DatadogExporter(
    service="my-app",
    env="production",
    agent_url="http://dd-agent:8126",
    api_key="your-dd-api-key",
))
# Push to Grafana Loki
await stream.drain(GrafanaLokiExporter(
    url="http://loki:3100",
    labels={"app": "my-app", "env": "production"},
))
# Fan-out: guard-blocked events → Slack webhook
await stream.route(
    WebhookExporter("https://hooks.slack.com/your-webhook"),
    predicate=lambda e: e.event_type == "llm.guard.blocked",
)
)
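Predicate-based routing is simple to picture: every event is offered to an exporter, and only the ones the predicate accepts are sent. The following sketch uses plain dictionaries and an in-memory exporter; `ListExporter` and this `route` function are hypothetical and are not the library's `EventStream` implementation.

```python
import asyncio


class ListExporter:
    """Hypothetical exporter that collects events in memory."""

    def __init__(self):
        self.received = []

    async def export(self, event):
        self.received.append(event)


async def route(events, exporter, predicate=lambda event: True):
    """Send each event that satisfies predicate to exporter; return the count."""
    sent = 0
    for event in events:
        if predicate(event):
            await exporter.export(event)
            sent += 1
    return sent


async def main():
    events = [
        {"event_type": "llm.trace.span.completed"},
        {"event_type": "llm.guard.blocked"},
    ]
    everything, alerts = ListExporter(), ListExporter()
    await route(events, everything)  # no predicate: all events
    await route(events, alerts, lambda e: e["event_type"] == "llm.guard.blocked")
    return everything, alerts


everything, alerts = asyncio.run(main())
```

Wrapping the calls in `main()` and starting it with `asyncio.run` is also the usual way to run `await`-based snippets from a script.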
Kafka source
from llm_toolkit_schema.stream import EventStream
# Drain a Kafka topic directly into an EventStream
stream = EventStream.from_kafka(
    topic="llm-events",
    bootstrap_servers="kafka:9092",
    group_id="analytics",
    max_messages=5000,
)
await stream.drain(exporter)
5 — Check compliance from the command line
llm-toolkit-schema check-compat events.json
✓ CHK-1 All required fields present (500 / 500 events)
✓ CHK-2 Event types valid (500 / 500 events)
✓ CHK-3 Source identifiers well-formed (500 / 500 events)
✓ CHK-5 Event IDs are valid ULIDs (500 / 500 events)
All checks passed.
Drop this into your CI pipeline and catch schema drift before it reaches production.
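For a feel of what such checks involve, here is a rough standard-library sketch that counts how many events pass three simple tests. The field name `event_id`, the required-field set, and the `name@MAJOR.MINOR.PATCH` source pattern are assumptions inferred from the examples above; they are not the rules the real `check-compat` command applies.

```python
import re

# Assumed shapes, for illustration only.
REQUIRED = {"event_id", "event_type", "source"}
ULID_RE = re.compile(r"^[0-7][0-9A-HJKMNP-TV-Z]{25}$")
SOURCE_RE = re.compile(r"^[a-z0-9][a-z0-9._-]*@\d+\.\d+\.\d+$")


def run_checks(events):
    """Return {check name: number of events that pass}."""
    return {
        "required_fields": sum(REQUIRED.issubset(e) for e in events),
        "source_well_formed": sum(
            bool(SOURCE_RE.match(str(e.get("source", "")))) for e in events
        ),
        "event_id_is_ulid": sum(
            bool(ULID_RE.match(str(e.get("event_id", "")))) for e in events
        ),
    }


def all_passed(events):
    """True when every event passes every check."""
    return all(count == len(events) for count in run_checks(events).values())


good = {
    "event_id": "01ARZ3NDEKTSV4RRFFQ69G5FAV",
    "event_type": "llm.trace.span.completed",
    "source": "my-app@1.0.0",
}
bad = {"event_id": "not-a-ulid", "event_type": "llm.trace.span.completed"}
report = run_checks([good, bad])
```

In a CI job, a script like this would exit with a non-zero status when `all_passed` returns `False`, which is what makes the pipeline fail.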
What's inside the box
| Module | What it does | For whom |
|---|---|---|
| llm_toolkit_schema.event | The core Event envelope: the one structure all tools share | Everyone |
| llm_toolkit_schema.types | All built-in event type strings (trace, cost, cache, eval, guard…) | Everyone |
| llm_toolkit_schema.redact | PII detection, sensitivity levels, redaction policies | Data privacy / GDPR teams |
| llm_toolkit_schema.signing | HMAC-SHA256 event signing and tamper-evident audit chains | Security / compliance teams |
| llm_toolkit_schema.compliance | Programmatic v1.0 compatibility checks (no pytest required) | Platform / DevOps teams |
| llm_toolkit_schema.export | Ship events to files (JSONL), HTTP webhooks, OTLP collectors, Datadog APM, or Grafana Loki | Infra / observability teams |
| llm_toolkit_schema.stream | Fan-out router: one drain() call reaches multiple backends; Kafka source via from_kafka() | Platform engineers |
| llm_toolkit_schema.validate | JSON Schema validation against the published v1.0 schema | All teams |
| llm_toolkit_schema.consumer | Declare schema-namespace dependencies; fail fast at startup if version requirements aren’t met | Platform / integration teams |
| llm_toolkit_schema.governance | Policy-based event gating: block prohibited types, warn on deprecated usage, enforce custom rules | Platform / compliance teams |
| llm_toolkit_schema.deprecations | Register and surface per-event-type deprecation notices at runtime | Library maintainers |
| llm_toolkit_schema.integrations | Plug-in adapters for LangChain (LLMSchemaCallbackHandler) and LlamaIndex (LLMSchemaEventHandler) | App developers |
| llm_toolkit_schema.namespaces | Typed payload dataclasses for all 10 built-in event namespaces | Tool authors |
| llm_toolkit_schema.models | Optional Pydantic v2 models for teams that prefer validated schemas | API / backend teams |
Event namespaces
Every event carries a payload — a dictionary whose shape is defined by the event's namespace. The ten built-in namespaces cover everything from raw model traces to safety guardrails:
| Namespace prefix | Dataclass | What it records |
|---|---|---|
| llm.trace.* | TracePayload | Model call: tokens, latency, finish reason (frozen v1) |
| llm.cost.* | CostPayload | Per-call cost in USD |
| llm.cache.* | CachePayload | Cache hit/miss, backend, TTL |
| llm.eval.* | EvalScenarioPayload | Scores, labels, evaluator identity |
| llm.guard.* | GuardPayload | Safety classifier output, block decisions |
| llm.fence.* | FencePayload | Topic constraints, allow/block lists |
| llm.prompt.* | PromptPayload | Prompt template version, rendered text |
| llm.redact.* | RedactPayload | PII audit record: what was found and removed |
| llm.diff.* | DiffPayload | Prompt/response delta between two events |
| llm.template.* | TemplatePayload | Template registry metadata |
from llm_toolkit_schema import Event
from llm_toolkit_schema.namespaces.trace import TracePayload
payload = TracePayload(
    model="gpt-4o",
    prompt_tokens=512,
    completion_tokens=128,
    latency_ms=340.5,
    finish_reason="stop",
)
event = Event(
    event_type="llm.trace.span.completed",
    source="my-app@1.0.0",
    payload=payload.to_dict(),
)
)
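If you are curious what a typed payload with a `to_dict()` method might look like internally, the pattern can be sketched with a frozen dataclass. `SketchTracePayload` is a hypothetical class written for this example; the validation rule and the choice to drop unset optional fields are assumptions, not the behaviour of the library's `TracePayload`.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchTracePayload:
    """Hypothetical typed payload mirroring the fields used in the example."""
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float
    finish_reason: Optional[str] = None

    def __post_init__(self):
        # Reject obviously invalid values at construction time.
        if self.prompt_tokens < 0 or self.completion_tokens < 0:
            raise ValueError("token counts must be non-negative")

    def to_dict(self):
        """Return a plain dict, leaving out optional fields that were not set."""
        return {k: v for k, v in asdict(self).items() if v is not None}


payload = SketchTracePayload(
    model="gpt-4o",
    prompt_tokens=512,
    completion_tokens=128,
    latency_ms=340.5,
)
as_dict = payload.to_dict()
```

Freezing the dataclass means a payload cannot be modified after it has been built, which suits records that may later be signed.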
Quality standards
- 1 214 tests: unit, integration, property-based (Hypothesis), and performance benchmarks
- 100 % line and branch coverage: no dead code ships
- Zero required dependencies: the entire core runs on Python's standard library alone
- Typed: full py.typed marker; works with mypy and pyright out of the box
- Frozen v1 trace schema: llm.trace.* payload fields will never break between minor releases
Project structure
llm_toolkit_schema/
├── event.py ← The Event envelope (start here)
├── types.py ← EventType enum
├── signing.py ← HMAC signing & audit chains
├── redact.py ← PII redaction
├── validate.py ← JSON Schema validation
├── consumer.py ← Consumer registry & schema-version compatibility
├── governance.py ← Event governance policies
├── deprecations.py ← Per-event-type deprecation tracking
├── compliance/ ← Compatibility checklist suite
├── export/
│ ├── jsonl.py ← Local file export
│ ├── webhook.py ← HTTP POST export
│ ├── otlp.py ← OpenTelemetry export
│ ├── datadog.py ← Datadog APM traces + metrics
│ └── grafana.py ← Grafana Loki export
├── stream.py ← EventStream fan-out router (+ Kafka source)
├── integrations/
│ ├── langchain.py ← LangChain callback handler
│ └── llamaindex.py ← LlamaIndex event handler
├── namespaces/ ← Typed payload dataclasses
│ ├── trace.py (frozen v1)
│ ├── cost.py
│ ├── cache.py
│ └── …
├── models.py ← Optional Pydantic v2 models
└── migrate.py ← Schema migration helpers & v2 roadmap
Development setup
git clone https://github.com/llm-toolkit/llm-toolkit-schema.git
cd llm-toolkit-schema
python -m venv .venv
.venv\Scripts\activate # Windows
# source .venv/bin/activate # macOS / Linux
pip install -e ".[dev]"
pytest # run all 1 214 tests
Code quality commands
ruff check . # linting
ruff format . # auto-format
mypy llm_toolkit_schema # type checking
pytest --cov # tests + coverage report
Build the docs locally
pip install -e ".[docs]"
cd docs
sphinx-build -b html . _build/html # open _build/html/index.html
Compatibility & versioning
This project follows Semantic Versioning:
- Patch releases (1.0.x): bug fixes only, fully backwards-compatible
- Minor releases (1.x.0): new features, backwards-compatible
- Major releases (x.0.0): breaking changes, announced in advance
The llm.trace.* namespace payload schema is additionally frozen at v1: even a major release will not remove or rename fields from TracePayload.
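As a small illustration of how a consumer might act on these rules, the sketch below classifies the difference between two version strings and treats any newer release within the same major version as a compatible upgrade. The helper names are hypothetical, and the code assumes plain `MAJOR.MINOR.PATCH` strings without pre-release suffixes.

```python
def parse_version(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_kind(old, new):
    """Classify the change from old to new as 'major', 'minor', 'patch' or 'none'."""
    old_v, new_v = parse_version(old), parse_version(new)
    for name, before, after in zip(("major", "minor", "patch"), old_v, new_v):
        if before != after:
            return name
    return "none"


def is_compatible_upgrade(installed, candidate):
    """True when candidate is newer than installed and keeps the same major version."""
    newer = parse_version(candidate) > parse_version(installed)
    return newer and bump_kind(installed, candidate) != "major"
```

For example, `is_compatible_upgrade("1.0.0", "1.1.0")` is true, while moving from `1.1.0` to `2.0.0` is flagged as a major change that needs review.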
Changelog
See docs/changelog.md or the release history on PyPI.
Contributing
Contributions are welcome! Please read the Contributing Guide first, then open an issue or pull request.
Key rules:
- All new code must maintain 100 % test coverage
- Follow the existing Google-style docstrings
- Run ruff and mypy before submitting
License
MIT — free for personal and commercial use.
Made with care for the LLM Developer Toolkit ecosystem.
File details
Details for the file llm_toolkit_schema-1.1.0.tar.gz.
File metadata
- Download URL: llm_toolkit_schema-1.1.0.tar.gz
- Upload date:
- Size: 261.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 75617b6b2abd341aab8bbc613939654b050e436695878f250c70cdd03413f364 |
| MD5 | da242190319aac3da3c63d8b7bee8dbf |
| BLAKE2b-256 | c4695b333492264d4e1e9f29382827265c1927e896e43672245e6aa59943bf5b |
File details
Details for the file llm_toolkit_schema-1.1.0-py3-none-any.whl.
File metadata
- Download URL: llm_toolkit_schema-1.1.0-py3-none-any.whl
- Upload date:
- Size: 120.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a82a878ef1a5f6e28cf9296fab8cc4775b40fcb28311892c9191f5e0584e400 |
| MD5 | ae8696a94200e209ddc92aae0a8f56cb |
| BLAKE2b-256 | 75feb86f9086e3b08b87311e429d64882d413f4959eb4fa7ddfc97123e96ab32 |