
The foundational shared event schema for the LLM Developer Toolkit

llm-toolkit-schema

The shared language every LLM tool speaks.
A lightweight Python library that gives your AI applications a common, structured way to record, sign, redact, and export events — with zero mandatory dependencies.



What is this?

Think of llm-toolkit-schema as a universal receipt format for your AI application. Every time your app calls a language model, makes a decision, redacts private data, or checks a guardrail — this library gives that action a consistent, structured record that any tool in your stack can read.

Without a shared schema, every team invents their own log format. With llm-toolkit-schema, your logs, dashboards, compliance reports, and monitoring tools all speak the same language — automatically.


Why use it?

| Without llm-toolkit-schema | With llm-toolkit-schema |
| --- | --- |
| Each service logs events differently | Every event follows the same structure |
| Hard to audit who saw what data | Built-in HMAC signing creates a tamper-evident audit trail |
| PII scattered across logs | First-class PII redaction before data leaves your app |
| Vendor-specific observability | OpenTelemetry-compatible; works with any monitoring stack |
| No way to check compatibility | CLI + programmatic compliance checks in CI |
| Complex integration glue | Zero required dependencies: just pip install |

Install

pip install llm-toolkit-schema
import llm_toolkit_schema  # that's it — no configuration needed

Requires Python 3.9 or later. No other packages are required for core usage.

Optional extras

pip install "llm-toolkit-schema[jsonschema]"   # strict JSON Schema validation
pip install "llm-toolkit-schema[http]"         # Webhook + OTLP export
pip install "llm-toolkit-schema[pydantic]"     # Pydantic v2 model layer
pip install "llm-toolkit-schema[otel]"         # OpenTelemetry SDK integration
pip install "llm-toolkit-schema[kafka]"        # EventStream.from_kafka() via kafka-python
pip install "llm-toolkit-schema[langchain]"    # LangChain callback handler
pip install "llm-toolkit-schema[llamaindex]"   # LlamaIndex event handler
pip install "llm-toolkit-schema[datadog]"      # Datadog APM + metrics exporter
pip install "llm-toolkit-schema[all]"          # everything above

Five-minute tour

1 — Record an event

from llm_toolkit_schema import Event, EventType, Tags

event = Event(
    event_type=EventType.TRACE_SPAN_COMPLETED,
    source="my-app@1.0.0",          # who emitted this
    org_id="org_acme",              # your organisation
    payload={
        "model": "gpt-4o",
        "prompt_tokens": 512,
        "completion_tokens": 128,
        "latency_ms": 340.5,
    },
    tags=Tags(env="production"),
)

event.validate()         # raises if structure is invalid
print(event.to_json())   # compact JSON string, ready to store or ship

Every event gets a ULID (a time-sortable unique ID) automatically — no need to generate one yourself.
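
To see why a ULID sorts by creation time, here is a minimal standard-library sketch of the layout described in the public ULID spec (48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford base32 characters). The helper name make_ulid is hypothetical, and this is not the library's own generator.

```python
import os
import time

# Crockford base32 alphabet used by the ULID spec (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def make_ulid(timestamp_ms=None):
    """Build a 26-character ULID-style ID: 48-bit time + 80-bit randomness."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    randomness = int.from_bytes(os.urandom(10), "big")
    value = (timestamp_ms << 80) | randomness
    chars = []
    for _ in range(26):
        # Peel off 5 bits at a time, least significant first.
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


earlier = make_ulid(timestamp_ms=1_700_000_000_000)
later = make_ulid(timestamp_ms=1_700_000_000_001)
print(earlier < later)  # True: plain string comparison follows creation time
```

Because the timestamp occupies the leading characters and the alphabet is in ascending ASCII order, sorting IDs as strings sorts events chronologically at millisecond resolution.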


2 — Redact private information before logging

from llm_toolkit_schema.redact import Redactable, RedactionPolicy, Sensitivity

policy = RedactionPolicy(min_sensitivity=Sensitivity.PII, redacted_by="policy:gdpr-v1")

# Wrap any string that might contain PII
prompt = Redactable("Call me at 555-867-5309", sensitivity=Sensitivity.PII)

result = policy.apply({"prompt": prompt})
# result["prompt"] → "[REDACTED by policy:gdpr-v1]"

Redactable is a string wrapper. You mark fields as sensitive at the point where they're created; the policy decides what to remove before the event is written to any log.
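
The idea of "mark at creation, decide at write time" can be sketched with the standard library alone. The names Level, Marked, and apply_policy below are hypothetical stand-ins, and the ordering of sensitivity levels is an assumption; the library's Redactable and RedactionPolicy may work differently.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical sensitivity ladder; higher means more sensitive."""
    LOW = 1
    PII = 2
    PHI = 3


class Marked(str):
    """A str subclass that carries a sensitivity label alongside its text."""

    def __new__(cls, text, level):
        obj = super().__new__(cls, text)
        obj.level = level
        return obj


def apply_policy(record, min_level, redacted_by):
    """Return a copy of record with sufficiently sensitive values masked."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, Marked) and value.level >= min_level:
            cleaned[key] = f"[REDACTED by {redacted_by}]"
        elif isinstance(value, Marked):
            cleaned[key] = str(value)  # below the threshold: keep as plain text
        else:
            cleaned[key] = value
    return cleaned


record = {
    "prompt": Marked("Call me at 555-867-5309", Level.PII),
    "model": "gpt-4o",
}
result = apply_policy(record, min_level=Level.PII, redacted_by="policy:gdpr-v1")
print(result["prompt"])  # [REDACTED by policy:gdpr-v1]
```

Keeping the label on the value itself means code that builds the record does not need to know which policy will eventually run.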


3 — Sign events for tamper-proof audit trails

from llm_toolkit_schema.signing import sign_event, verify_chain, AuditStream

# Sign a single event
signed = sign_event(event, org_secret="my-org-secret")

# Or build a chain — every event references the one before it,
# so any gap or modification is immediately detectable.
stream = AuditStream(org_secret="my-org-secret")
for e in events:
    stream.append(e)

is_valid, violations = verify_chain(stream.events, org_secret="my-org-secret")

This is the same chaining principle used in blockchains and other append-only logs: each event's signature covers the previous event's signature, so nobody without the org secret can alter or remove an event without breaking the chain.
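
The chaining idea can be illustrated with Python's hmac module. This is a sketch under stated assumptions, not the library's signing format: the canonical JSON encoding, the empty string used as the first "previous signature", and the function names sign, build_chain, and verify are all made up for illustration.

```python
import hashlib
import hmac
import json


def sign(payload, prev_signature, secret):
    """HMAC-SHA256 over the canonical payload plus the previous signature."""
    body = json.dumps(payload, sort_keys=True).encode() + prev_signature.encode()
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def build_chain(payloads, secret):
    """Sign payloads in order, linking each entry to the one before it."""
    chain, prev = [], ""
    for payload in payloads:
        prev = sign(payload, prev, secret)
        chain.append({"payload": payload, "signature": prev})
    return chain


def verify(chain, secret):
    """Return the indexes of entries whose signature does not match."""
    violations, prev = [], ""
    for index, entry in enumerate(chain):
        expected = sign(entry["payload"], prev, secret)
        if not hmac.compare_digest(expected, entry["signature"]):
            violations.append(index)
        prev = entry["signature"]
    return violations


chain = build_chain([{"n": 1}, {"n": 2}, {"n": 3}], secret="my-org-secret")
print(verify(chain, "my-org-secret"))   # [] (no violations)

chain[1]["payload"]["n"] = 99           # tamper with the middle entry
print(verify(chain, "my-org-secret"))   # [1] (the altered entry is flagged)
```

Deleting an entry is caught the same way: the entry that follows the gap was signed over a signature that is no longer its predecessor.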


4 — Export to anywhere

from llm_toolkit_schema.stream import EventStream
from llm_toolkit_schema.export.jsonl import JSONLExporter
from llm_toolkit_schema.export.webhook import WebhookExporter
from llm_toolkit_schema.export.otlp import OTLPExporter
from llm_toolkit_schema.export.datadog import DatadogExporter
from llm_toolkit_schema.export.grafana import GrafanaLokiExporter

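# `events` is a list of Event objects. The awaits below must run inside
# an async function (for example, one started with asyncio.run()).
stream = EventStream(events)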

# Write everything to a local file
await stream.drain(JSONLExporter("events.jsonl"))

# Ship to your OpenTelemetry collector
await stream.drain(OTLPExporter("http://otel-collector:4318/v1/traces"))

# Send to Datadog APM (traces + metrics)
await stream.drain(DatadogExporter(
    service="my-app",
    env="production",
    agent_url="http://dd-agent:8126",
    api_key="your-dd-api-key",
))

# Push to Grafana Loki
await stream.drain(GrafanaLokiExporter(
    url="http://loki:3100",
    labels={"app": "my-app", "env": "production"},
))

# Fan-out: guard-blocked events → Slack webhook
await stream.route(
    WebhookExporter("https://hooks.slack.com/your-webhook"),
    predicate=lambda e: e.event_type == "llm.guard.blocked",
)
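
The difference between draining everything and routing a filtered subset can be sketched with asyncio alone. ListExporter and route below are hypothetical stand-ins, and events are plain dicts rather than Event objects; the real EventStream and exporter interfaces may differ.

```python
import asyncio


class ListExporter:
    """Stand-in exporter that collects events in memory."""

    def __init__(self):
        self.received = []

    async def export(self, event):
        self.received.append(event)


async def route(events, exporter, predicate=lambda event: True):
    """Send each event that satisfies predicate to exporter; return the count."""
    sent = 0
    for event in events:
        if predicate(event):
            await exporter.export(event)
            sent += 1
    return sent


async def main():
    events = [
        {"event_type": "llm.trace.span.completed"},
        {"event_type": "llm.guard.blocked"},
    ]
    archive, alerts = ListExporter(), ListExporter()
    await route(events, archive)  # no predicate: behaves like a full drain
    await route(events, alerts, lambda e: e["event_type"] == "llm.guard.blocked")
    return archive, alerts


archive, alerts = asyncio.run(main())
print(len(archive.received), len(alerts.received))  # 2 1
```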

Kafka source

from llm_toolkit_schema.stream import EventStream

# Drain a Kafka topic directly into an EventStream
stream = EventStream.from_kafka(
    topic="llm-events",
    bootstrap_servers="kafka:9092",
    group_id="analytics",
    max_messages=5000,
)
await stream.drain(exporter)   # `exporter` is any exporter from the section above

5 — Check compliance from the command line

llm-toolkit-schema check-compat events.json
✓  CHK-1  All required fields present          (500 / 500 events)
✓  CHK-2  Event types valid                    (500 / 500 events)
✓  CHK-3  Source identifiers well-formed       (500 / 500 events)
✓  CHK-5  Event IDs are valid ULIDs            (500 / 500 events)
All checks passed.

Drop this into your CI pipeline and catch schema drift before it reaches production.
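
For intuition about what checks of this kind involve, here is a small sketch that counts how many events pass each rule. The required-field list, the source pattern (name@MAJOR.MINOR.PATCH), and the check names are assumptions made for illustration; the CLI's actual rules are defined by the library, not by this snippet.

```python
import re

# Assumed rules for illustration; the real checks may differ.
REQUIRED = ("event_id", "event_type", "source")
ULID_RE = re.compile(r"[0-9A-HJKMNP-TV-Z]{26}")
SOURCE_RE = re.compile(r"[a-z0-9][a-z0-9._-]*@\d+\.\d+\.\d+")

CHECKS = {
    "required fields present": lambda e: all(k in e for k in REQUIRED),
    "source well-formed": lambda e: bool(SOURCE_RE.fullmatch(e.get("source", ""))),
    "event ID is a ULID": lambda e: bool(ULID_RE.fullmatch(e.get("event_id", ""))),
}


def run_checks(events):
    """Return {check name: number of events that passed}."""
    return {
        name: sum(1 for event in events if check(event))
        for name, check in CHECKS.items()
    }


events = [
    {"event_id": "01ARZ3NDEKTSV4RRFFQ69G5FAV",
     "event_type": "llm.trace.span.completed",
     "source": "my-app@1.0.0"},
    {"event_id": "not-a-ulid",
     "event_type": "llm.guard.blocked",
     "source": "my-app"},
]
report = run_checks(events)
for name, passed in report.items():
    mark = "ok" if passed == len(events) else "FAIL"
    print(f"{mark:4}  {name:26} ({passed} / {len(events)} events)")

all_passed = all(passed == len(events) for passed in report.values())
# In CI you would finish with: sys.exit(0 if all_passed else 1)
```

Exiting non-zero on any failed check is what lets a CI job stop a release when the event format drifts.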


What's inside the box

| Module | What it does | For whom |
| --- | --- | --- |
| llm_toolkit_schema.event | The core Event envelope, the one structure all tools share | Everyone |
| llm_toolkit_schema.types | All built-in event type strings (trace, cost, cache, eval, guard…) | Everyone |
| llm_toolkit_schema.redact | PII detection, sensitivity levels, redaction policies | Data privacy / GDPR teams |
| llm_toolkit_schema.signing | HMAC-SHA256 event signing and tamper-evident audit chains | Security / compliance teams |
| llm_toolkit_schema.compliance | Programmatic v1.0 compatibility checks; no pytest required | Platform / DevOps teams |
| llm_toolkit_schema.export | Ship events to files (JSONL), HTTP webhooks, OTLP collectors, Datadog APM, or Grafana Loki | Infra / observability teams |
| llm_toolkit_schema.stream | Fan-out router: one drain() call reaches multiple backends; Kafka source via from_kafka() | Platform engineers |
| llm_toolkit_schema.validate | JSON Schema validation against the published v1.0 schema | All teams |
| llm_toolkit_schema.consumer | Declare schema-namespace dependencies; fail fast at startup if version requirements aren't met | Platform / integration teams |
| llm_toolkit_schema.governance | Policy-based event gating: block prohibited types, warn on deprecated usage, enforce custom rules | Platform / compliance teams |
| llm_toolkit_schema.deprecations | Register and surface per-event-type deprecation notices at runtime | Library maintainers |
| llm_toolkit_schema.integrations | Plug-in adapters for LangChain (LLMSchemaCallbackHandler) and LlamaIndex (LLMSchemaEventHandler) | App developers |
| llm_toolkit_schema.namespaces | Typed payload dataclasses for all 10 built-in event namespaces | Tool authors |
| llm_toolkit_schema.models | Optional Pydantic v2 models for teams that prefer validated schemas | API / backend teams |

Event namespaces

Every event carries a payload — a dictionary whose shape is defined by the event's namespace. The ten built-in namespaces cover everything from raw model traces to safety guardrails:

| Namespace prefix | Dataclass | What it records |
| --- | --- | --- |
| llm.trace.* | TracePayload | Model call: tokens, latency, finish reason (frozen v1) |
| llm.cost.* | CostPayload | Per-call cost in USD |
| llm.cache.* | CachePayload | Cache hit/miss, backend, TTL |
| llm.eval.* | EvalScenarioPayload | Scores, labels, evaluator identity |
| llm.guard.* | GuardPayload | Safety classifier output, block decisions |
| llm.fence.* | FencePayload | Topic constraints, allow/block lists |
| llm.prompt.* | PromptPayload | Prompt template version, rendered text |
| llm.redact.* | RedactPayload | PII audit record: what was found and removed |
| llm.diff.* | DiffPayload | Prompt/response delta between two events |
| llm.template.* | TemplatePayload | Template registry metadata |

from llm_toolkit_schema import Event
from llm_toolkit_schema.namespaces.trace import TracePayload

payload = TracePayload(
    model="gpt-4o",
    prompt_tokens=512,
    completion_tokens=128,
    latency_ms=340.5,
    finish_reason="stop",
)

event = Event(
    event_type="llm.trace.span.completed",
    source="my-app@1.0.0",
    payload=payload.to_dict(),
)
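
If you are wondering what a typed payload buys you over a raw dict, the pattern can be sketched with a frozen dataclass. TraceSketch is a hypothetical stand-in that reuses the field names from the example above; the validation rule and the optional finish_reason default are assumptions, not the library's TracePayload definition.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class TraceSketch:
    """Hypothetical stand-in for a typed trace payload."""
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float
    finish_reason: Optional[str] = None

    def __post_init__(self):
        # Reject obviously invalid values at construction time.
        if self.prompt_tokens < 0 or self.completion_tokens < 0:
            raise ValueError("token counts must be non-negative")

    def to_dict(self):
        """Plain dict suitable for an event's payload field."""
        return asdict(self)


payload = TraceSketch("gpt-4o", 512, 128, 340.5, finish_reason="stop")
print(payload.to_dict()["finish_reason"])  # stop
```

A misspelled field name or a negative token count fails when the payload is built, rather than surfacing later in a dashboard.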

Quality standards

  • 1 214 tests — unit, integration, property-based (Hypothesis), and performance benchmarks
  • 100 % line and branch coverage — no dead code ships
  • Zero required dependencies — the entire core runs on Python's standard library alone
  • Typed — full py.typed marker; works with mypy and pyright out of the box
  • Frozen v1 trace schema: llm.trace.* payload fields will never break between minor releases

Project structure

llm_toolkit_schema/
├── event.py          ← The Event envelope (start here)
├── types.py          ← EventType enum
├── signing.py        ← HMAC signing & audit chains
├── redact.py         ← PII redaction
├── validate.py       ← JSON Schema validation
├── consumer.py       ← Consumer registry & schema-version compatibility
├── governance.py     ← Event governance policies
├── deprecations.py   ← Per-event-type deprecation tracking
├── compliance/       ← Compatibility checklist suite
├── export/
│   ├── jsonl.py      ← Local file export
│   ├── webhook.py    ← HTTP POST export
│   ├── otlp.py       ← OpenTelemetry export
│   ├── datadog.py    ← Datadog APM traces + metrics
│   └── grafana.py    ← Grafana Loki export
├── stream.py         ← EventStream fan-out router (+ Kafka source)
├── integrations/
│   ├── langchain.py  ← LangChain callback handler
│   └── llamaindex.py ← LlamaIndex event handler
├── namespaces/       ← Typed payload dataclasses
│   ├── trace.py        (frozen v1)
│   ├── cost.py
│   ├── cache.py
│   └── …
├── models.py         ← Optional Pydantic v2 models
└── migrate.py        ← Schema migration helpers & v2 roadmap

Development setup

git clone https://github.com/llm-toolkit/llm-toolkit-schema.git
cd llm-toolkit-schema

python -m venv .venv
.venv\Scripts\activate          # Windows
# source .venv/bin/activate     # macOS / Linux

pip install -e ".[dev]"
pytest                          # run all 1 214 tests

Code quality commands

ruff check .                  # linting
ruff format .                 # auto-format
mypy llm_toolkit_schema       # type checking
pytest --cov                  # tests + coverage report

Build the docs locally

pip install -e ".[docs]"
cd docs
sphinx-build -b html . _build/html   # open _build/html/index.html

Compatibility & versioning

This project follows Semantic Versioning:

  • Patch releases (1.0.x) — bug fixes only, fully backwards-compatible
  • Minor releases (1.x.0) — new features, backwards-compatible
  • Major releases (x.0.0) — breaking changes, announced in advance

The llm.trace.* namespace payload schema is additionally frozen at v1: even a major release will not remove or rename fields from TracePayload.
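
As a rough illustration of how these rules translate into an upgrade decision, the sketch below classifies the jump between two version strings. The helper names are hypothetical and it handles only plain MAJOR.MINOR.PATCH versions, with no pre-release or build tags.

```python
def parse(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_kind(old, new):
    """Classify the change from old to new as major, minor, patch, or none."""
    old_v, new_v = parse(old), parse(new)
    if new_v[0] != old_v[0]:
        return "major"
    if new_v[1] != old_v[1]:
        return "minor"
    if new_v[2] != old_v[2]:
        return "patch"
    return "none"


def safe_to_upgrade(old, new):
    """Under SemVer, only a major bump may contain breaking changes."""
    return parse(new) >= parse(old) and bump_kind(old, new) != "major"


print(bump_kind("1.0.0", "1.1.0"))        # minor
print(safe_to_upgrade("1.1.0", "2.0.0"))  # False
```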


Changelog

See docs/changelog.md or the release history on PyPI.


Contributing

Contributions are welcome! Please read the Contributing Guide first, then open an issue or pull request.

Key rules:

  • All new code must maintain 100 % test coverage
  • Follow the existing Google-style docstrings
  • Run ruff and mypy before submitting

License

MIT — free for personal and commercial use.


Made with care for the LLM Developer Toolkit ecosystem.
PyPI · Docs · Quickstart · API Reference · Report a bug
