Agent telemetry SDK for envpod — two functions, full observability

envpod-coat

Two functions. Full agent observability. Zero runtime dependencies.

from envpod_coat import coat, tie

@coat wraps sync and async functions. It emits span lifecycle events with the function name, input and output previews, timing, and trace/span IDs that the envpod runtime correlates with pod audit events.

When you turn on auto_capture=True for a sync function, the decorator also captures a sanitized snapshot of the function's local variables on return. auto_capture is off by default, and async wrappers deliberately skip local-variable capture — use tie() for intermediate state in async code.
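One way a decorator could snapshot a sync function's locals on return is to install a profile hook for the duration of the call. The sketch below illustrates that general technique; it is not envpod-coat's implementation. The names `capture_locals`, `sanitize`, and `SENSITIVE_HINTS`, and the redaction rule itself, are hypothetical.

```python
import functools
import sys

# Hypothetical redaction rule: mask any local whose name hints at a secret.
SENSITIVE_HINTS = ("key", "token", "secret", "password")
PREVIEW_LEN = 200  # mirrors the documented ENVPOD_COAT_PREVIEW_LEN default


def sanitize(name, value):
    """Return a short, safe string preview of one local variable."""
    if any(hint in name.lower() for hint in SENSITIVE_HINTS):
        return "[redacted]"
    return repr(value)[:PREVIEW_LEN]


def capture_locals(fn):
    """Snapshot fn's locals at return time via a temporary profile hook."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        snapshot = {}

        def hook(frame, event, arg):
            # Only the decorated function's own frame is of interest.
            if event == "return" and frame.f_code is fn.__code__:
                for name, value in frame.f_locals.items():
                    snapshot[name] = sanitize(name, value)

        previous = sys.getprofile()
        sys.setprofile(hook)
        try:
            return fn(*args, **kwargs)
        finally:
            # Restore whatever hook was active and publish the snapshot.
            sys.setprofile(previous)
            wrapper.last_locals = snapshot

    wrapper.last_locals = {}
    return wrapper


@capture_locals
def add(a, b):
    total = a + b
    api_key = "sk-not-a-real-key"
    return total
```

After `add(2, 3)`, `add.last_locals` holds string previews of `a`, `b`, and `total`, with `api_key` masked. A frame-based hook like this cannot follow a coroutine across suspension points in a simple way, which is one plausible reason to keep local capture out of async wrappers.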

tie() reports from inside a function when you need intermediate state. The metaphor: wear the coat, tie up the loose ends inside it.

emit is exported as an alias for tie if your fingers prefer the standard observability verb. Op is exported as an optional convenience enum: @coat(Op.LLM) is equivalent to @coat("llm").
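The equivalence between an enum member and its string tag can be illustrated with a string-valued enum. This is a sketch of the general pattern, not the SDK's actual Op class; `OpSketch`, its `TOOL` member, and `normalize_op` are hypothetical names.

```python
from enum import Enum


class OpSketch(str, Enum):
    """Hypothetical stand-in for an Op-style enum; members are plain strings."""

    LLM = "llm"
    TOOL = "tool"  # assumed member, chosen to match the "tool" string tag


def normalize_op(op):
    """Accept an enum member or a custom string tag and return the string tag."""
    return op.value if isinstance(op, Enum) else str(op)
```

With this shape, `normalize_op(OpSketch.LLM)` and `normalize_op("llm")` yield the same tag, while a custom string such as `"billing"` passes through unchanged.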

Together with envpod's kernel telemetry, you get three-layer observability: what the agent thought, what the machine did, and who authorized it.

Status

v0.1.16 — telemetry only. The decision plane (Allow / Deny / Modify / Queue gates) lands in v0.3.0. This release ships the reference producer that emits canonical AOT (aotp/0.1) envelopes to the envpod pod runtime.

New in v0.1.16:

  • Op enum — convenience typing shortcut. @coat(Op.LLM) is the same as @coat("llm"); strings still work for custom domain tags.
  • auto_capture is binary + off by default — turn it on per function with @coat(Op.LLM, auto_capture=True) or globally via ENVPOD_COAT_CAPTURE=redacted. Env var wins over decorator.
  • PID-aware correlation — every coat envelope travels with the caller PID. The runtime combines emitted span fields, PID, and time windows to join coat spans with DNS, vault, and other pod events. Ambiguous kernel events (e.g. overlapping async coroutines on the same PID) render with a [heuristic] tag in envpod audit --group-by-span; precise attribution via trace/span propagation INTO kernel events ships in v0.1.17.
  • Server-side identity injection: subject.agent_id is overwritten by the SO_PEERCRED-resolved claim; a tampered SDK cannot forge a registered-agent identity.
  • Anonymous fallback — callers that haven't claimed yet get envpod.claim_kind = "anonymous" plus an anonymous fingerprint, distinct from named agents in dashboards.
  • Framework adapters: the envpod-coat[langchain] and envpod-coat[claude-agent] extras replace manual wrapping with a one-line registration. Four more frameworks follow in v0.1.17.
  • OpenInference OTEL export — every coat event whose kind maps to an OpenInference span kind (LLM / TOOL / AGENT / CHAIN / RETRIEVER) emits OpenInference attributes alongside envpod.* on the same OTLP log record. Phoenix, Arize, and Datadog LLM render coat spans natively when you point pod.yaml's otel.endpoint at them. See docs/AGENT-COAT.md → Transport for the attribute table.
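The PID-aware correlation item above describes joining kernel events to coat spans by PID and time window, with a heuristic tag when the match is ambiguous. A minimal sketch of that idea follows. The data shapes, the tie-break rule (prefer the most recently started span), and all names are assumptions, not the runtime's actual algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    span_id: str
    pid: int
    start: float  # seconds
    end: float


@dataclass(frozen=True)
class KernelEvent:
    name: str
    pid: int
    ts: float


def attribute(event, spans):
    """Return (span_id or None, heuristic flag) for one kernel event."""
    candidates = [
        s for s in spans
        if s.pid == event.pid and s.start <= event.ts <= s.end
    ]
    if not candidates:
        return None, False
    if len(candidates) == 1:
        return candidates[0].span_id, False
    # Several spans on the same PID cover this timestamp (for example
    # overlapping coroutines): pick the most recently started one and
    # flag the result as a guess.
    best = max(candidates, key=lambda s: s.start)
    return best.span_id, True
```

An event that falls inside exactly one span on its PID gets a precise attribution; one that falls inside two overlapping spans comes back with the flag set, which is the situation the [heuristic] tag describes.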

Install

pip install envpod-coat

Python 3.10+. Zero runtime dependencies.

Quickstart

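import anthropic
from envpod_coat import coat, tie

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

@coat("llm")
def ask_claude(messages):
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,  # required by the Messages API
        messages=messages,
    )
    return response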

@coat("tool")
def write_file(path, content):
    with open(path, "w") as f:
        f.write(content)

@coat("tool")
def process_batch(rows):
    # chunk() and transform() stand in for your own helpers
    results = []
    for i, batch in enumerate(chunk(rows, 100)):
        results.append(transform(batch))
        tie("batch", {"i": i, "size": len(batch)})
    return results

response = ask_claude([{"role": "user", "content": "refactor main.py"}])
write_file("main.py", response.content[0].text)

That's it. Every call emits telemetry to envpod. Add auto_capture=True when you want local variable capture on a specific sync function. The pod runtime uses the emitted span IDs for coat-to-coat relationships and combines PID with time windows to line up coat spans with kernel events (DNS, vault, etc.).

Identity-aware coat events

Every coat envelope carries a Layer-C identity field set server-side — the SDK does NOT self-claim. Your in-pod process declares a registered agent once via the main envpod SDK, and subsequent @coat events inherit that claim automatically:

from envpod import InPodClient
with InPodClient() as c:
    c.claim("reviewer", ttl_secs=3600)   # claim first

from envpod_coat import coat, Op
@coat(Op.LLM)                            # then coat events carry
def review(code): ...                    # reviewer's claim_id

The claim-binding flow lives in the main envpod package because the kernel trust root is SO_PEERCRED on identity.sock. envpod-coat does not expose a claim() function — the decorator consumes claimed identity; it does not produce it.
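For readers unfamiliar with SO_PEERCRED, the sketch below shows how a server on a Unix socket can read the caller's kernel-verified PID and UID on Linux, and how it might then overwrite whatever identity the client sent. The `socket.SO_PEERCRED` call is standard on Linux; the claims table keyed by PID and the `inject_identity` function are hypothetical and say nothing about how the envpod daemon actually stores claims.

```python
import socket
import struct


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process on the other end of a Unix socket.

    Linux-only: SO_PEERCRED yields a struct ucred of three native ints.
    """
    size = struct.calcsize("3i")
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack("3i", raw)


def inject_identity(envelope, pid, uid, claims):
    """Overwrite identity fields from a server-side claims table keyed by PID.

    `claims` maps pid -> agent_id and is a hypothetical stand-in.
    """
    out = dict(envelope)
    envpod = dict(out.get("envpod", {}), pid=pid)
    claim = claims.get(pid)
    if claim is not None:
        # Whatever the SDK sent in subject.agent_id is discarded.
        out["subject"] = dict(out.get("subject", {}), agent_id=claim)
        envpod["claim_kind"] = "claimed"
    else:
        out["subject"] = dict(out.get("subject", {}), agent_id="")
        envpod["claim_kind"] = "anonymous"
        envpod["anonymous"] = {"pid": pid, "uid": uid}
    out["envpod"] = envpod
    return out
```

The key property is that the credentials come from the kernel rather than from the message body, so an envelope that arrives claiming to be "admin" is still labelled by whatever the server resolves for the sending process.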

Dashboards distinguish claimed vs. anonymous callers by the envpod.claim_kind attribute. Unclaimed processes still get traceable audit entries under an anonymous fingerprint (envpod.anonymous.*) but stay out of the registered-agent namespace.
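A consumer of the audit stream could separate the two populations along these lines. This is a sketch that assumes envelopes are dictionaries shaped like the examples in the Wire format section; `identity_key` and `count_by_identity` are hypothetical helpers, not part of any envpod package.

```python
from collections import Counter


def identity_key(envelope):
    """Map an envelope to the label a dashboard might group it under."""
    envpod = envelope.get("envpod", {})
    if envpod.get("claim_kind") == "claimed":
        return "agent:" + envelope["subject"]["agent_id"]
    # Unclaimed callers are grouped by their anonymous fingerprint instead.
    fingerprint = envpod.get("anonymous", {}).get("fingerprint", "unknown")
    return "anonymous:" + fingerprint


def count_by_identity(envelopes):
    """Count envelopes per identity label."""
    return Counter(identity_key(e) for e in envelopes)
```

Keeping the two prefixes distinct means an anonymous fingerprint can never collide with a registered agent name in the resulting counts.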

Transport

Inside an envpod pod (ENVPOD_POD_ID set, /run/envpod/queue.sock present), events go through the queue socket and become entries in audit.jsonl interleaved with kernel events.

Outside a pod, events fall through to stderr as JSON lines prefixed with [coat] so you can develop locally and see your telemetry. When you deploy inside a pod, kernel correlation activates automatically.
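The selection rule described above can be sketched as a small pure function. The environment variable names and the default socket path come from this document; the function names, the injected `socket_exists` check, and the exact JSON formatting after the [coat] prefix are assumptions.

```python
import json
import os


def pick_transport(env, socket_exists):
    """Return ("socket", path) inside a pod, otherwise ("stderr", None)."""
    path = env.get("ENVPOD_SOCKET", "/run/envpod/queue.sock")
    if env.get("ENVPOD_POD_ID") and socket_exists(path):
        return "socket", path
    return "stderr", None


def stderr_line(envelope):
    """Format one envelope for the local fallback: the [coat] prefix plus JSON."""
    return "[coat] " + json.dumps(envelope, separators=(",", ":"))


if __name__ == "__main__":
    # On a developer machine this normally prints ('stderr', None).
    print(pick_transport(os.environ, os.path.exists))
```

Passing the environment and the existence check in as arguments keeps the decision testable without a real pod.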

Failure mode

Telemetry is best-effort. If the queue socket is missing or the write fails, the SDK logs one warning and falls back to stderr. The agent is never blocked on telemetry delivery. Fail-closed semantics arrive in v0.3.0 with the decision plane.
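The best-effort behaviour can be pictured as a sender that tries the primary channel, warns once on the first failure, and keeps writing to a fallback stream. The class below is a sketch under those assumptions; `BestEffortSender`, its constructor arguments, and the warning text are hypothetical and do not reflect the SDK's internals.

```python
import io
import json
import sys


class BestEffortSender:
    """Try a primary writer; on failure warn once and fall back to a stream."""

    def __init__(self, primary, fallback=None):
        self._primary = primary  # callable taking one str; may raise OSError
        self._fallback = sys.stderr if fallback is None else fallback
        self._warned = False

    def send(self, envelope):
        """Deliver one envelope; return True only if the primary accepted it."""
        line = json.dumps(envelope)
        try:
            self._primary(line)
            return True
        except OSError as exc:
            if not self._warned:
                self._fallback.write(
                    f"[coat] warning: queue write failed ({exc}); using stderr\n"
                )
                self._warned = True
            self._fallback.write("[coat] " + line + "\n")
            return False


if __name__ == "__main__":
    def broken(line):
        raise FileNotFoundError("queue socket missing")

    buffer = io.StringIO()
    sender = BestEffortSender(broken, fallback=buffer)
    sender.send({"n": 1})
    sender.send({"n": 2})
    print(buffer.getvalue(), end="")
```

The caller never sees an exception from `send`, which is the property that keeps the agent from blocking on telemetry delivery.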

Wire format

Canonical AOT (Agent Operation Taxonomy) envelope. Server-side identity injection fills in subject and envpod.* fields on the pod runtime, so the SDK itself never needs to declare identity.

Claimed caller (process has called InPodClient.claim(...)):

{
  "spec": "aotp/0.1",
  "kind": "agent.llm.request",
  "ts": "2026-04-14T01:30:00.123Z",
  "ids": {
    "session_id": "sess_abcd1234",
    "run_id": "run_abcd1234",
    "step_id": "step_abcd1234",
    "trace_id": "tr_abcd1234",
    "span_id": "sp_abcd1234",
    "parent_span_id": ""
  },
  "subject": {
    "agent_id": "reviewer",
    "agent_role": "code-review",
    "framework": "envpod-coat",
    "framework_version": "0.1.16"
  },
  "capture": { "mode": "redacted" },
  "payload": {
    "fn": "ask_claude",
    "preview": { "args": [], "kwargs": {} }
  },
  "envpod": {
    "pod_id": "pod_demo",
    "pid": 4242,
    "claim_kind": "claimed",
    "agent_claim_id": "jti_xyz",
    "capabilities": ["read", "commit"],
    "session": "sess_xyz"
  }
}

Anonymous caller (no claim yet — the registered-agent namespace stays untouched; a traceable fingerprint rides in envpod.anonymous instead):

{
  "spec": "aotp/0.1",
  "kind": "agent.llm.request",
  "subject": { "agent_id": "", "agent_role": "" },
  "envpod": {
    "pod_id": "pod_demo",
    "pid": 999,
    "uid": 60000,
    "claim_kind": "anonymous",
    "anonymous": {
      "pid": 999,
      "uid": 60000,
      "session": "sess_999",
      "fingerprint": "anon_pid999_uid60000_sesssess_999"
    }
  }
}

See ideas/agent-coat/OPERATION-TAXONOMY.md in the envpod repo for the full spec of the 16 core operations (agent.session.*, agent.run.*, agent.step.*, agent.llm.*, agent.tool.*, agent.memory.*, agent.subagent.*, agent.human.request).
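As an illustration of the SDK-side half of this format, the sketch below assembles the fields a producer could fill in before the runtime adds identity. Field names follow the examples above; the ID shape (prefix plus eight hex characters), the use of `repr` for previews, and the function names are assumptions, and session, run, and step IDs are omitted for brevity.

```python
import uuid
from datetime import datetime, timezone


def _short_id(prefix):
    # Assumed ID shape, modelled on the examples: prefix + 8 hex characters.
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


def _preview(value, max_len=200):
    """String preview of a value, cut to max_len characters."""
    return repr(value)[:max_len]


def build_envelope(kind, fn_name, args=(), kwargs=None, parent_span_id="", now=None):
    """Assemble the SDK-side part of an aotp/0.1-shaped envelope."""
    now = now or datetime.now(timezone.utc)
    ts = (
        now.astimezone(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {
        "spec": "aotp/0.1",
        "kind": kind,
        "ts": ts,
        "ids": {
            "trace_id": _short_id("tr"),
            "span_id": _short_id("sp"),
            "parent_span_id": parent_span_id,
        },
        "payload": {
            "fn": fn_name,
            "preview": {
                "args": [_preview(a) for a in args],
                "kwargs": {k: _preview(v) for k, v in (kwargs or {}).items()},
            },
        },
        # subject and envpod are left out on purpose: per the text above,
        # the pod runtime fills them in server-side.
    }
```

Leaving `subject` and `envpod` out of the producer's output mirrors the rule that the SDK never declares identity itself.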

Environment variables

| Variable | Default | Purpose |
|---|---|---|
| ENVPOD_POD_ID | unset | When set, transport prefers the queue socket. |
| ENVPOD_SOCKET | /run/envpod/queue.sock | Override socket path. |
| ENVPOD_COAT_CAPTURE | unset (→ none) | none / redacted / hash / summary / raw. Turns capture on globally. Overrides decorator. |
| ENVPOD_COAT_PREVIEW_LEN | 200 | Max chars per captured value. |
| ENVPOD_AGENT_ID | unset | Populates envelope.subject.agent_id on non-envpod transports. Inside a pod, the daemon overrides this from the kernel-authenticated claim. |
| ENVPOD_AGENT_ROLE | unset | Populates envelope.subject.agent_role on non-envpod transports. |
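The precedence between the environment and the decorator can be sketched as follows. Variable names and defaults come from the table above; the mapping of auto_capture=True to the "redacted" mode, the ValueError on an unknown mode, and the names `CoatSettings` and `resolve_settings` are assumptions.

```python
from dataclasses import dataclass

VALID_MODES = ("none", "redacted", "hash", "summary", "raw")


@dataclass(frozen=True)
class CoatSettings:
    socket_path: str
    capture_mode: str
    preview_len: int


def resolve_settings(env, auto_capture=False):
    """Combine environment variables with a decorator's auto_capture flag."""
    env_mode = env.get("ENVPOD_COAT_CAPTURE")
    if env_mode is not None:
        if env_mode not in VALID_MODES:
            raise ValueError(f"unknown ENVPOD_COAT_CAPTURE value: {env_mode!r}")
        mode = env_mode  # the environment wins over the decorator
    else:
        # Assumption: auto_capture=True corresponds to the "redacted" mode.
        mode = "redacted" if auto_capture else "none"
    return CoatSettings(
        socket_path=env.get("ENVPOD_SOCKET", "/run/envpod/queue.sock"),
        capture_mode=mode,
        preview_len=int(env.get("ENVPOD_COAT_PREVIEW_LEN", "200")),
    )
```

In normal use the first argument would be `os.environ`; passing a plain dictionary makes the precedence easy to check, for instance that ENVPOD_COAT_CAPTURE=none switches capture off even for a function decorated with auto_capture=True.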

Licence

Apache-2.0 with explicit patent grant. The envpod runtime stays Premium proprietary; this SDK is permissive so any agent framework can adopt it.

Roadmap

  • v0.1.15 — Two-function Python SDK, queue socket transport, canonical AOT emission.
  • v0.1.16 (this release) — Op enum, binary auto_capture, PID correlation, server-side identity injection, anonymous fallback, LangChain + Claude Agent SDK adapters, envpod audit --all --group-by-span CLI.
  • v0.1.17 — Remaining adapters (OpenAI Agents, CrewAI, AutoGen, Pydantic AI), OverlayFS fanotify PID coverage, SDK→daemon span-context registration (precise kernel event attribution).
  • v0.3.0 — Governance gates: SDK upgrades to mediated mode against the new envpod-coatd daemon.
  • v0.3.1 — TypeScript SDK.
