Durable, dependency-free Python SDK for capturing AI-agent runs and shipping them to Intencion.

intencion

Durable, dependency-free Python SDK for capturing AI-agent runs and shipping them to Intencion. Pure stdlib, Python 3.8+, non-blocking background transport.

Install

pip install intencion

Quickstart: the whole integration is one line

Wrap your model client once. From then on every call is captured automatically, with the model, token usage, latency, and outcome filled in for you, and the intent inferred on the server:

from openai import OpenAI
import intencion

intencion.init(api_key="in_pk_...")              # call once at startup
client = intencion.instrument_openai(OpenAI())   # the whole integration

# Use the client normally. A tracked run shows up in Intencion for every call.
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "where is my order?"}],
)

That is the entire setup: an API key plus one line. Anthropic and Gemini work the same way with instrument_anthropic and instrument_gemini, and each call's intent is inferred on the server (for example order_status). See Auto-instrumentation for streaming, framework-built clients, and pinning a fixed intent.

Let your AI assistant wire it up

Point your editor's AI (Claude, Cursor) at this README plus your agent file and ask:

Instrument this agent with the intencion package: init() once, auto-instrument the model client, wrap each user turn in intencion.run and the whole conversation in intencion.session(conversation_id, user=user_id), record tool calls with run.tool, and flush() before the process exits. Keep the diff minimal.

Everything below is enough context for that to one-shot.

Concepts: run, step, session, trace

A run is one unit of agent work: one goal, with an input, an outcome, and the steps taken to get there. A step is one action inside a run, a single model call or tool call. Runs and steps nest: the run is the goal, its steps are the moves.

A session and a trace are two independent ways to group runs. A session gathers the runs of one continuous span of agent work — a chat, a long task, a background job — by session id and (optional) user. A trace gathers the runs of one task into a causal tree, a parent run with its sub-agent runs underneath. They sit on separate axes, so a run carries a session and a trace at once, and one session can hold many traces.

For a single request handled in one unit of work, one run and its steps are the whole picture. Sessions and traces start to matter once a session spans many runs or a task fans out into sub-agent runs. See Sessions and Traces.
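The relationships above can be pictured as a small data model. The sketch below is illustrative only: the `Run` and `Step` dataclasses and the `group_by` helper are hypothetical stand-ins, not the SDK's own types. The field names `session_id`, `trace_id`, and `parent_run_id` follow the names used later in this README.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Step:
    """One action inside a run (hypothetical stand-in)."""
    name: str
    status: str = "success"


@dataclass
class Run:
    """One unit of agent work (hypothetical stand-in)."""
    run_id: str
    session_id: Optional[str] = None
    trace_id: Optional[str] = None
    parent_run_id: Optional[str] = None
    steps: List[Step] = field(default_factory=list)


def group_by(runs: List[Run], attr: str) -> Dict[str, List[str]]:
    """Group run ids by one attribute, skipping runs where it is unset."""
    groups = defaultdict(list)
    for run in runs:
        key = getattr(run, attr)
        if key is not None:
            groups[key].append(run.run_id)
    return dict(groups)


# One session holding two traces; r2 is a sub-agent run under r1.
runs = [
    Run("r1", session_id="s_1", trace_id="t_1"),
    Run("r2", session_id="s_1", trace_id="t_1", parent_run_id="r1"),
    Run("r3", session_id="s_1", trace_id="t_2"),
]
by_session = group_by(runs, "session_id")
by_trace = group_by(runs, "trace_id")
```

Grouping the same three runs by session gives one group, while grouping by trace gives two, which is the "separate axes" point made above.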

Group work into runs (for richer signal)

Auto-instrumentation captures one run per model call. To group an agent's work into a meaningful unit (one user turn, one task), wrap it in intencion.run(...). Model calls inside the block fold in as steps on that run. You don't name the intent. Leave it off and Intencion infers a real, per-run label from the input:

import intencion

intencion.init(api_key="in_pk_...")           # call once at startup

with intencion.run(input=user_msg, user="u_123",     # intent inferred ("auto")
                   session="s_1", model="gpt-4o") as run:
    run.step(name="lookup_order", tool="db", status="success", ms=42)
    result = my_agent(user_msg)               # your agent work
    # outcome defaults to "success"; override with run.fail("...")

# decorator form
@intencion.trace(capture_input=True)
def classify(msg): ...

result = intencion.flush()                    # force send queued runs
if result.dropped:                            # confirm it landed, don't assume
    ...                                       # something is misconfigured

Pass intent= explicitly only when you want deterministic grouping under a label you control (e.g. intent="checkout"). Don't hardcode a single constant like intent="chat-turn" on every run, since that buckets distinct work under one label; leaving it as "auto" gives a real intent per run. If the wrapped code raises, the run is recorded as failure and the exception is re-raised unchanged.

Runs are batched and sent on a background worker, so you normally never touch the transport. Two things make delivery observable instead of silent: flush() returns a FlushResult (sent / dropped / queued) so you can confirm telemetry actually reached the server, and a rejected API key (or other auth/config error) prints a one-time warning even with debug=False, so a bad key never drops every run quietly. (Using capture before init() warns once too.)
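One way to act on the flush result is a small helper that turns it into a message you can log or assert on. This is a sketch under assumptions: it takes any object exposing `sent`, `dropped`, and `queued`, and it assumes those are integer counts. `describe_flush` is a hypothetical helper, not part of the SDK, and the example uses a stand-in object instead of a real `FlushResult`.

```python
from types import SimpleNamespace


def describe_flush(result) -> str:
    """Summarize an object exposing sent / dropped / queued (assumed to be counts)."""
    if result.dropped:
        return f"dropped {result.dropped} run(s): check the API key and endpoint"
    if result.queued:
        return f"{result.queued} run(s) still queued: flush again or raise the timeout"
    return f"ok: {result.sent} run(s) sent"


# Stand-in for the value returned by intencion.flush().
fake_result = SimpleNamespace(sent=3, dropped=0, queued=0)
message = describe_flush(fake_result)
```

In a real script you would pass the return value of `intencion.flush()` instead of the stand-in.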

Outcomes

Outcomes are deterministic: no model judges success. A run(...) block that exits normally is success; one that raises is failure. But agents usually catch their errors and reply anyway, so "the block returned" is not "the user was helped." Five mechanisms keep failures from being silently counted as success:

1. run.tool(...): record a tool call without forgetting its status. It times the call, marks the step success or (on raise) error with the message, and returns the value (or re-raises):

with intencion.run(intent="refund_request", input=msg) as run:
    order = run.tool("lookup_order", "orders-db", lambda: lookup_order(oid))
    refund = run.tool("issue_refund", "payments", lambda: issue_refund(order))
    # the tool kind is optional: run.tool("lookup_order", lambda: lookup_order(oid))

2. Caught tool errors are a reliability signal, not an outcome. If the block exits normally but a run.tool() step errored (and you recovered), the error is recorded on the step and surfaced as a reliability signal; the run stays success on the goal axis. To treat an errored step as a failure, opt in with a resolver: classify_outcome=lambda run: "failure" if run.has_errored_steps else None.

3. Declarative classification. Centralize outcome logic with a global classify_outcome resolver instead of scattering run.fail() calls:

intencion.init(
    api_key="in_pk_...",
    classify_outcome=lambda run: "failure" if not run.steps
                     else "failure" if run.has_errored_steps else None,
)

4. confirm_outcome: close the "successful read still didn't help" gap. This global resolver answers the question "was the user's goal actually met?". Unlike classify_outcome (structural), it can inspect the run's business result, which you feed in with run.set_result(...) (the context-manager API has no return value). That lets you downgrade a run whose tools all succeeded but whose result was empty, deterministically, with no judge model:

intencion.init(
    api_key="in_pk_...",
    # a search run that returned zero hits didn't meet the goal
    confirm_outcome=lambda run: "failure"
        if isinstance(run.last_result, dict) and run.last_result.get("hits") == [] else None,
)

with intencion.run(intent="search") as run:
    hits = run.tool("query", "search-index", lambda: search(q))
    run.set_result({"hits": hits})   # confirm_outcome sees run.last_result

Precedence: explicit ok()/fail() → confirm_outcome → classify_outcome → return/raise default (a returned run is success).

5. Built-in heuristics: label the common failures with no code. confirm_outcome_from_heuristics() returns a ready-made confirm_outcome with stable reason codes. By default it flags empty_output (an empty/whitespace string answer) and no_results (an empty collection like [] or {"hits": []}). Still deterministic, still no judge model:

import intencion
from intencion import confirm_outcome_from_heuristics

intencion.init(api_key="in_pk_...", confirm_outcome=confirm_outcome_from_heuristics())

Not every run is a conversational answer, so the answer-shaped checks are conservative: refusal detection is opt-in (refusal_phrases=True, or pass your own list), and missing_output=True also flags a run that returned nothing at all. Scope it to the intents that are conversational with intents=["support", "chat"] (or a skip callable), point get_text at a nested answer field when the result isn't a plain string (e.g. lambda r: r["message"]["content"]), and set results_keys for your own empty-collection keys.
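If the built-in heuristics don't fit, a hand-written resolver with the same shape is short. The sketch below is not the implementation behind confirm_outcome_from_heuristics(); it is a hypothetical `confirm_nonempty` function that reads `run.last_result` and reuses the `empty_output` and `no_results` reason codes named above. The `hits` key is taken from the earlier search example, and the run objects are stand-ins.

```python
from types import SimpleNamespace


def confirm_nonempty(run):
    """Return a failure verdict for empty answers, or None to defer."""
    result = run.last_result
    # An empty or whitespace-only string answer.
    if isinstance(result, str) and not result.strip():
        return {"outcome": "failure", "reason": "empty_output"}
    # An empty collection returned directly.
    if isinstance(result, (list, tuple)) and len(result) == 0:
        return {"outcome": "failure", "reason": "no_results"}
    # An empty collection nested under the "hits" key.
    if isinstance(result, dict) and result.get("hits") == []:
        return {"outcome": "failure", "reason": "no_results"}
    return None


# Stand-ins for run objects; only last_result is read.
blank = confirm_nonempty(SimpleNamespace(last_result="   "))
no_hits = confirm_nonempty(SimpleNamespace(last_result={"hits": []}))
fine = confirm_nonempty(SimpleNamespace(last_result={"hits": ["doc-1"]}))
```

A function like this would be passed as `confirm_outcome=confirm_nonempty` in `intencion.init(...)`, the same slot the lambda examples above use.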

Labeling failures. Any resolver can return {"outcome": ..., "reason": ...} instead of a bare outcome string; on a "failure" the reason becomes the run's failure_reason, so failures group by mode on the dashboard:

confirm_outcome=lambda run: {"outcome": "failure", "reason": "no_results"}
    if isinstance(run.last_result, list) and len(run.last_result) == 0 else None,
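The precedence order and the two resolver return shapes can be modeled in a few lines. This is a sketch of the documented order for a block that returned normally, not the SDK's code: `resolve_outcome` is a hypothetical function, `explicit` stands for an ok()/fail() call, and the run object is a stand-in.

```python
from types import SimpleNamespace


def resolve_outcome(run, explicit=None, confirm=None, classify=None):
    """Return (outcome, reason) for a run whose block returned normally."""
    # 1. An explicit ok()/fail() wins outright.
    if explicit is not None:
        return explicit, None
    # 2. confirm_outcome is consulted before 3. classify_outcome.
    for resolver in (confirm, classify):
        if resolver is None:
            continue
        verdict = resolver(run)
        if verdict is None:
            continue  # this resolver defers to the next one
        if isinstance(verdict, dict):
            return verdict["outcome"], verdict.get("reason")
        return verdict, None
    # 4. Default: a returned run is success.
    return "success", None


def confirm(run):
    if run.last_result.get("hits") == []:
        return {"outcome": "failure", "reason": "no_results"}
    return None


def classify(run):
    return "failure" if run.has_errored_steps else None


empty_run = SimpleNamespace(last_result={"hits": []}, has_errored_steps=True)
verdict = resolve_outcome(empty_run, confirm=confirm, classify=classify)
```

Here `confirm` answers first, so the failure carries the `no_results` reason even though `classify` would also have flagged the run.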

Auto-instrumentation (zero per-call code)

Wrap your OpenAI or Anthropic client once and every model call is captured automatically (model, token usage, latency, and outcome) with no run.step(...) calls:

from openai import OpenAI
import intencion

intencion.init(api_key="in_pk_...")
client = intencion.instrument_openai(OpenAI())   # the whole integration

# Just use the client. A run shows up in Intencion for every call.
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "where is my order?"}],
)

Calls made inside an intencion.run(...) block become steps on that run, and their model + token usage are folded into it.
Calls made outside a run emit a standalone one-call run. Its intent defaults to "auto", which the server infers into a real label (e.g. order_status) from the input.
Sync, async (AsyncOpenAI / AsyncAnthropic), and streaming calls are all supported; iteration is transparent.

client = intencion.instrument_anthropic(Anthropic())
# Pin a fixed intent, or skip prompt capture:
client = intencion.instrument_openai(OpenAI(), intent="support", capture_input=False)

Patching is at the class level, so it covers every client instance, including the ones agent frameworks (LangChain, the OpenAI Agents SDK, LlamaIndex, Instructor) build internally. You can pass a client, or call with no argument to patch the installed package directly:

intencion.instrument_openai()       # patches the openai package (covers framework-built clients)
intencion.instrument_anthropic()    # patches the anthropic package

It instruments create, parse (structured outputs), and the stream() helper, across sync/async. instrument_* is idempotent and never raises; enable debug=True logging to see which methods were patched (it warns loudly if it found nothing, so a miss isn't silent).

For streamed OpenAI chat completions, the call is always captured, but token counts arrive only if you pass stream_options={"include_usage": True} (an OpenAI requirement). Anthropic streaming and OpenAI Responses streaming capture tokens with no extra flag.

Gemini is covered too: intencion.instrument_gemini(client) patches google-genai's models.generate_content / generate_content_stream (sync + async) at the class level.

Not yet auto-instrumented (roadmap): stacks that don't call the official SDK — the Vercel AI SDK, CrewAI/LiteLLM, and raw boto3 Bedrock. Capture these today by wrapping calls in intencion.run() and recording steps with run.tool(); for stacks that emit OpenTelemetry (e.g. the Vercel AI SDK) you can alternatively stream their OTel export into the OpenTelemetry ingest endpoint. To group a multi-call task into one run tree, wrap it (see Traces).

Capturing content (output, tool I/O, request context)

By default the SDK captures full content so a failing run is diagnosable in-product: the prompt and reply text, reconstructed tool-call input/output, and the request context the model ran with — its system prompt, tool definitions, and sampling params (temperature, max_tokens, …). All content is run through PII + secret redaction before send. Switch to metadata-only capture with capture_content=False:

intencion.init(api_key="in_pk_...", capture_content=False)

With content capture on (the default), auto-instrumented model calls fold their reply text onto the run as output_text (streamed replies are reassembled from chunks), tool calls become steps with their input/output, and the system prompt, tool definitions, and request params are stored on the run. run.tool(name, kind, fn) records the tool's return value as the step's output. You can always attach step I/O explicitly: run.step(name=..., input=..., output=...).

Redaction catches the built-in PII + secret patterns (emails, cards, SSNs, phone numbers, API keys, tokens). Captured content can carry other sensitive data (names, account details), so supply a redactor or redact_patterns to scrub what's specific to your domain.

Backfill existing logs (no re-instrumentation)

Already have OpenAI / Anthropic logs on disk or in a warehouse? Import them as runs without touching your app, so you see value before changing any code. Each log becomes one run: input, model, tokens, and tool-call steps are extracted with the same parser live capture uses:

# one log dict, or a list. `request` is the body you sent; `response` is what you got back.
intencion.import_openai([
    {"request": saved_request, "response": saved_response,
     "intent": "support", "id": log_id, "user": user, "session": session},
])
intencion.import_anthropic({"request": request, "response": response, "id": log_id})

# provider-agnostic (CSV/JSONL/your own shape):
intencion.import_runs([
    {"intent": "checkout", "input": msg, "model": "gpt-4o", "outcome": "success", "steps": [...]},
])

intencion.flush()

Put a stable id on a record (the provider's chatcmpl-… id or your own conversation id) and re-importing is idempotent: the id is the server's dedupe key. Import bypasses sampling (an explicit backfill is always captured) and is redacted like any other run.
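For logs stored as JSONL, the provider-agnostic path mostly comes down to reshaping rows. The sketch below assumes a made-up log shape (`conversation`, `prompt`, `model`, `ok`) and a hypothetical helper `records_from_jsonl`; only the output keys follow the import_runs example above. That an `id` key is accepted on these records is an assumption based on the "stable id on a record" sentence.

```python
import json
from typing import Dict, List

# Hypothetical log lines; a real export will have its own field names.
SAMPLE_JSONL = """\
{"conversation": "c_1", "prompt": "where is my order?", "model": "gpt-4o", "ok": true}
{"conversation": "c_2", "prompt": "cancel my plan", "model": "gpt-4o", "ok": false}
"""


def records_from_jsonl(text: str, intent: str) -> List[Dict]:
    """Map JSONL log rows onto the record shape shown for import_runs."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        records.append({
            "id": row["conversation"],  # stable id, so a re-import can dedupe
            "intent": intent,
            "input": row["prompt"],
            "model": row["model"],
            "outcome": "success" if row["ok"] else "failure",
            "steps": [],
        })
    return records


records = records_from_jsonl(SAMPLE_JSONL, intent="support")
# With the SDK initialized, the list would then be handed to
# intencion.import_runs(records), followed by intencion.flush().
```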

Custom redaction

Redaction is on by default (redact=True). To plug in your own scrubbing, pass a redactor (fully replaces the built-in email/card/SSN/phone patterns) or redact_patterns (extra (pattern, replacement) applied after the built-ins). Preview exactly what would be removed before sending real traffic:

intencion.init(api_key="in_pk_...", redact_patterns=[(r"\bacct_\w+\b", "<ACCT>")])

intencion.preview_redaction("ref acct_777 for jane@example.com")
# {"redacted": "ref <ACCT> for <EMAIL>", "matches": [{"value": "acct_777", "replacement": "<ACCT>"}, ...]}
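To make the ordering concrete ("extra patterns applied after the built-ins"), here is a stdlib-only model of that pipeline. It is a sketch, not the SDK's redactor: `BUILTIN_PATTERNS` holds a single, deliberately simple email regex chosen for illustration, and `preview` is a hypothetical function that mimics the `redacted` / `matches` shape shown above.

```python
import re
from typing import Dict, List, Tuple

# Illustrative stand-in for the built-ins: one simple email pattern (assumption).
BUILTIN_PATTERNS = [(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "<EMAIL>")]


def preview(text: str, extra: List[Tuple[str, str]]) -> Dict:
    """Apply built-in patterns first, then the extra (pattern, replacement) pairs."""
    matches = []
    redacted = text
    for pattern, replacement in BUILTIN_PATTERNS + list(extra):
        def _sub(match, replacement=replacement):
            # Record what was scrubbed, then substitute the placeholder.
            matches.append({"value": match.group(0), "replacement": replacement})
            return replacement
        redacted = re.sub(pattern, _sub, redacted)
    return {"redacted": redacted, "matches": matches}


out = preview("ref acct_777 for jane@example.com", [(r"\bacct_\w+\b", "<ACCT>")])
```

In this model the email is replaced before the account pattern runs; the order of entries in the SDK's own `matches` list may differ.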

Sessions

Tie a whole session together — one continuous span of agent work (a chat, a long task, a background job). Every run created inside an intencion.session(...) block (an intencion.run(...) or an auto-instrumented call) inherits the session (and optional user) and is grouped by session_id, with no plumbing:

with intencion.session("s_123", user="u_42"):
    client.chat.completions.create(...)   # session_id = s_123
    client.chat.completions.create(...)   # same session

# imperative form for request handlers where wrapping a block isn't convenient:
intencion.set_session("s_123", user="u_42")
intencion.clear_session()

Nested sessions override; an explicit session= or user= on intencion.run(...) still wins over the ambient session.
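The override rules can be modeled with `contextvars`. This is a sketch of how an ambient session could behave, not the SDK's implementation; `ambient_session` and `effective_session` are hypothetical names. One assumption to note: in this model a nested session replaces both the session id and the user.

```python
import contextvars
from contextlib import contextmanager

# Ambient (session_id, user) pair; empty by default.
_ambient = contextvars.ContextVar("ambient_session", default=(None, None))


@contextmanager
def ambient_session(session_id, user=None):
    """Set the ambient session for the block, restoring the previous one on exit."""
    token = _ambient.set((session_id, user))
    try:
        yield
    finally:
        _ambient.reset(token)


def effective_session(session=None, user=None):
    """Explicit arguments win; otherwise fall back to the ambient values."""
    ambient_id, ambient_user = _ambient.get()
    return (
        session if session is not None else ambient_id,
        user if user is not None else ambient_user,
    )


seen = []
with ambient_session("s_outer", user="u_1"):
    seen.append(effective_session())                      # inherits the ambient pair
    with ambient_session("s_inner"):
        seen.append(effective_session())                  # nested session overrides
    seen.append(effective_session(session="s_explicit"))  # explicit session= wins
seen.append(effective_session())                          # nothing ambient outside
```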

Example: a multi-turn chat agent

Sessions aren't chat-specific, but a chat agent is the clearest example. Wrap the session in intencion.session(...) and each turn in intencion.agent_turn(...). The model calls + tool calls inside fold into steps under that one run, so an N-message conversation is N runs grouped by session_id: one run per turn, in order. No per-turn intent: each turn is labeled automatically from its input, so distinct asks land as distinct intents instead of one bucket:

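# client, MODEL, tools, messages, final_text, tool_uses, and exec_tool are your application's own names, not part of the SDK.
def handle_turn(conversation_id: str, user_id: str, message: str) -> str: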
    with intencion.session(conversation_id, user=user_id):
        with intencion.agent_turn(input=message) as run:   # intent inferred per turn
            while True:
                resp = client.messages.create(model=MODEL, tools=tools, messages=messages)  # captured as a step
                if resp.stop_reason != "tool_use":
                    return final_text
                for call in tool_uses(resp):
                    run.tool(call.name, "tool", lambda: exec_tool(call))   # tool step; error recorded on the step

Call handle_turn(...) for each message with the SAME conversation_id.

Traces

A trace groups the runs of one task into a causal tree. A multi-agent task reads as a parent run with its sub-agent runs nested underneath, so a failure pins to the exact sub-agent that caused it. A session is one continuous span of work (grouped by session id and user); a trace is one task (grouped by cause and effect). The two are independent axes.

Nesting is automatic. An intencion.run(...) opened inside another becomes its child: the two share a trace_id, and the child carries the parent's id as parent_run_id.

with intencion.run(intent="research_task"):         # the task: a trace root
    with intencion.run(intent="search"):
        ...                                           # a sub-agent run
    with intencion.run(intent="summarize"):
        ...                                           # another sub-agent run

Auto-instrumented model calls inside a run stay steps on that run; only an explicit intencion.run(...) becomes a child run. To group sibling runs that aren't nested, wrap them in intencion.task(...) (named task because intencion.trace is the run decorator):

with intencion.task():                   # both runs below are roots of one trace
    with intencion.run(intent="plan"):
        ...
    with intencion.run(intent="act"):
        ...

# or join a trace propagated across a process boundary:
with intencion.task(incoming_trace_id):
    with intencion.run(intent="step"):
        ...

A run with no enclosing trace and no nesting stands alone, exactly as before. OpenTelemetry exporters get the same shape automatically: a trace whose spans carry agent/chain boundaries (an intencion.intent, an OpenInference AGENT/CHAIN span, or intencion.run_boundary) splits into the matching run tree. Traces show up under Traces in the dashboard.
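The nesting rule (a child shares its parent's trace_id and carries the parent's id as parent_run_id) can be modeled with a context variable. This is a sketch of the rule only, not the SDK's code: the `run` context manager below is a hypothetical stand-in that yields plain dicts and generates ids with `uuid`.

```python
import contextvars
import uuid
from contextlib import contextmanager

_current = contextvars.ContextVar("current_run", default=None)
recorded = []  # runs in the order they finish


@contextmanager
def run(intent):
    """Open a run; if another run is in scope, become its child."""
    parent = _current.get()
    record = {
        "intent": intent,
        "run_id": uuid.uuid4().hex,
        # Children reuse the parent's trace; a root starts a new one.
        "trace_id": parent["trace_id"] if parent else uuid.uuid4().hex,
        "parent_run_id": parent["run_id"] if parent else None,
    }
    token = _current.set(record)
    try:
        yield record
    finally:
        _current.reset(token)
        recorded.append(record)


with run("research_task") as root:
    with run("search") as search:
        pass
    with run("summarize") as summarize:
        pass
```

Both sub-agent runs end up with the root's `trace_id` and point back at it through `parent_run_id`, which is what lets a failure be pinned to a specific sub-agent.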

Short-lived processes

The worker flushes on an interval, on atexit, and on SIGTERM/SIGINT. For a script, a serverless function, or any process that exits quickly, call flush() before the process ends to ensure queued runs are sent:

intencion.flush()      # block until queued runs are sent (or timeout)
intencion.shutdown()   # flush + stop the worker thread

Configuration

Option	Default	Meaning
`api_key`	(required)	Sent as `Authorization: Bearer <api_key>`.
`endpoint`	`https://intencion.io/api/ingest`	Ingest URL.
`flush_interval`	`5.0`	Seconds between timed flushes.
`max_batch`	`100`	Max runs per request (hard-capped at 500).
`max_queue`	`1000`	Bounded queue size; drop-oldest when full.
`sample_rate`	`1.0`	Fraction of runs captured (0.0 to 1.0).
`disabled`	`False`	Disable all capture.
`debug`	`False`	Enable debug logging on the `intencion` logger.
`redact`	`True`	Scrub PII (emails, cards, SSNs, phones) before send.
`redactor`	`None`	Custom `str -> str` that fully replaces the built-in PII patterns.
`redact_patterns`	`None`	Extra `[(pattern, replacement), ...]` applied after the built-ins.
`capture_content`	`True`	Capture content (redacted): reply text, tool I/O, system prompt, tool definitions, request params. Set `False` for metadata-only.
`confirm_outcome`	`None`	Goal-level resolver `lambda run: Outcome | {"outcome", "reason"} | None`; sees `run.last_result` / `run.produced_output` (set via `run.set_result(...)`). Runs before `classify_outcome`. A `reason` becomes the run's `failure_reason`.
`classify_outcome`	`None`	Structural resolver returning an outcome string, a `{"outcome", "reason"}` dict, or `None`, for un-set outcomes.
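The `max_queue` and `max_batch` rows describe a bounded, drop-oldest queue drained in batches. The sketch below models that behavior with `collections.deque`; it is an illustration of the documented semantics, not the SDK's transport, and `BoundedQueue` is a hypothetical class.

```python
from collections import deque


class BoundedQueue:
    """Drop-oldest queue drained in batches (hypothetical model)."""

    def __init__(self, max_queue=1000, max_batch=100):
        self._items = deque(maxlen=max_queue)
        self.max_batch = min(max_batch, 500)  # mirrors the documented hard cap
        self.dropped = 0

    def put(self, run):
        # A full deque with maxlen discards its oldest item on append.
        if len(self._items) == self._items.maxlen:
            self.dropped += 1
        self._items.append(run)

    def next_batch(self):
        size = min(self.max_batch, len(self._items))
        return [self._items.popleft() for _ in range(size)]


queue = BoundedQueue(max_queue=3, max_batch=2)
for run_number in range(5):
    queue.put(run_number)

first = queue.next_batch()
second = queue.next_batch()
```

With room for three runs and five enqueued, the two oldest are dropped and the remaining three leave in batches of at most two.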

To validate capture locally without the real endpoint, point init(endpoint=...) at a tiny local HTTP server and inspect the POSTed { "events": [run, ...] } body. Each run carries intent_label (your intent; stays "auto" if you let the server infer it), session_id, user_ref, steps (with per-step status/error), outcome, tokens_in/out, and latency_ms:

import http.server, json, threading
events = []
class H(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        events.extend(json.loads(body)["events"])
        self.send_response(200); self.end_headers(); self.wfile.write(b"{}")
    def log_message(self, *a): pass
srv = http.server.HTTPServer(("127.0.0.1", 8799), H)
threading.Thread(target=srv.serve_forever, daemon=True).start()
intencion.init(api_key="test", endpoint="http://127.0.0.1:8799/api/ingest", flush_interval=0.1)
# ... run your agent, then intencion.flush(); assert events[0]["outcome"] == ...

API

intencion.init(api_key, endpoint=None, flush_interval=5.0, max_batch=100,
               max_queue=1000, sample_rate=1.0, disabled=False, debug=False,
               redact=True, redactor=None, redact_patterns=None,
               capture_content=True,
               confirm_outcome=None, classify_outcome=None)

# Build a deterministic confirm_outcome (see Outcomes)
intencion.confirm_outcome_from_heuristics(empty_output=True, missing_output=False,
               no_results=True, refusal_phrases=False, outcome_for_refusal="failure",
               get_text=None, intents=None, skip=None)

intencion.run(intent=None, input=None, user=None, session=None, model=None)
# use as a context manager (with statement)

intencion.trace(intent=None, user=None, session=None, model=None, capture_input=False)
# use as a function decorator

intencion.flush(timeout=None)
intencion.shutdown(timeout=2.0)

# Auto-instrument a provider client (every call is captured automatically)
intencion.instrument_openai(client, intent="auto", capture_input=True)
intencion.instrument_anthropic(client, intent="auto", capture_input=True)
intencion.instrument_gemini(client, intent="auto", capture_input=True)

# Backfill existing logs (no app changes); see "Backfill existing logs"
intencion.import_openai(logs)        # one dict or a list of {"request","response",...}
intencion.import_anthropic(logs)
intencion.import_runs(records)        # provider-agnostic {"intent","input","steps",...}

# Dry-run what redaction would scrub before sending
intencion.preview_redaction(text)     # {"redacted": ..., "matches": [...]}

intencion.current_run()   # the run in scope inside a run() block, or None

A run object exposes: step(name, status="success", tool=None, ms=None, error=None), tool(name, tool=None, fn=...) (runs fn, records the step + status, returns its value), ok(), fail(reason=None), set_tokens(tokens_in, tokens_out), set_model(model), and the has_errored_steps property. The Outcomes examples also use set_result(...) and read last_result, and Capturing content shows step(...) taking input= and output=.

License

MIT. See LICENSE.

https://intencion.io

Release history

0.9.0 (this version): Jun 18, 2026
0.8.1: Jun 17, 2026
0.8.0: Jun 16, 2026
0.7.0: Jun 15, 2026
0.5.0: Jun 5, 2026
0.4.0: Jun 5, 2026
0.3.0: Jun 4, 2026
0.2.1: Jun 4, 2026
0.2.0: Jun 4, 2026
0.1.1: Jun 4, 2026
0.1.0: Jun 4, 2026

Download files

Source Distribution

intencion-0.9.0.tar.gz (73.5 kB)

Uploaded Jun 18, 2026 Source

Built Distribution

intencion-0.9.0-py3-none-any.whl (58.3 kB)

Uploaded Jun 18, 2026 Python 3

File details

Details for the file intencion-0.9.0.tar.gz.

File metadata

Download URL: intencion-0.9.0.tar.gz
Upload date: Jun 18, 2026
Size: 73.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.28 (uv publish, macOS)

File hashes

Hashes for intencion-0.9.0.tar.gz
Algorithm	Hash digest
SHA256	`53d4450457f11ddf4382c9c12115c5fd162ffb62737b1cc88ccd66e8ccd4bcd7`
MD5	`e302de221fc74d45c2e99f6e2d224816`
BLAKE2b-256	`347e2ae9d68b1b9858aa4e0355275badc90da992e2698c88f0a8699453c208e5`

File details

Details for the file intencion-0.9.0-py3-none-any.whl.

File metadata

Download URL: intencion-0.9.0-py3-none-any.whl
Upload date: Jun 18, 2026
Size: 58.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.28 (uv publish, macOS)

File hashes

Hashes for intencion-0.9.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1c38fc88144a36c46689c0021c2adc50885b9baf5f757ea626a2e401df4e2807`
MD5	`c4f25625030af4f943e4688be1622272`
BLAKE2b-256	`a0d7bd36433abea25f9d0a89a1c6d72e300becc5b5ca44d23b85566b68ae2d19`
