Drop-in runtime capture for AI agents. Two lines to install. Logs every LLM call (Anthropic, OpenAI, Google Gemini, Cohere, Mistral) to local JSONL. MIT.
tessen
The harness for your AI agents. Two lines to install. Every call your agent makes — thinking, tool use, retries, the cost of every cache miss — captured at forensic depth, so you find out what your agent is actually doing in production before your customer does.
Tame the beast in your agentic workflows. Use Tessen.
```python
import tessen
tessen.init(agent_name="my_agent")
```
That's the entire integration. Drop it before you construct your LLM client. Every call from every supported SDK flows through Tessen.
Install
```shell
pip install tessen
# or, if you want the optional log viewer
pip install "tessen[viewer]"
```
The core install has zero hard dependencies. Tessen only patches vendor SDKs that are already importable in your process — install whichever you actually use.
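As a rough illustration of "only patch what is importable", the check can be sketched with the standard library alone. The module names come from the capture table in this README; the helper `importable_sdks` and its detection logic are hypothetical and are not Tessen's actual code.

```python
import importlib.util

# Module names as listed in this README's capture table.
CANDIDATE_SDKS = [
    "anthropic",
    "openai",
    "google.generativeai",
    "google.genai",
    "cohere",
    "mistralai",
]


def importable_sdks(candidates=CANDIDATE_SDKS):
    """Return the subset of module names that can be imported in this process.

    Hypothetical helper: a sketch of the idea, not Tessen's implementation.
    """
    found = []
    for name in candidates:
        try:
            # For dotted names, find_spec imports the parent package first and
            # raises ModuleNotFoundError if that parent is missing.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is not None:
            found.append(name)
    return found


print(importable_sdks())
```

Anything missing from the printed list would simply be left alone under this approach, which is consistent with a core install that has no hard dependencies.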
Why
It's 3am. You're paged. Your agent burned $2,000 in API spend in two hours, looping on the same broken tool 47 times because the SDK swallowed a 502 and the agent silently retried. Your dashboard says the request succeeded. Your traces show one span. Your logs are noise.
You cannot defend what you cannot see — and the tools that watch your agent treat each model call like a web request instead of like a program that thinks. Tessen treats it like a program. Every thinking block, every tool call, every cache decision, every retry — recorded structurally, in your process, on your disk. When the page comes in, you have the receipts.
This is Act 1: capture. The wedge that gets the harness in place.
What gets captured
| SDK | What's intercepted |
|---|---|
| Anthropic (anthropic) | messages.create, messages.stream, async equivalents, batches API, and stream=True iterator returns (tee'd transparently) |
| OpenAI (openai) | chat.completions.create, responses.create, legacy completions.create, sync + async |
| Google Gemini | both google.generativeai (legacy) and google.genai (current), sync + async, streaming |
| Cohere (cohere) | chat, chat_stream, async |
| Mistral (mistralai) | chat.complete, chat.stream, async |
Streaming responses (both messages.stream(...) context managers and messages.create(..., stream=True) iterators) are tee'd transparently — your code does not change. The captured event records the assembled final message and chunk count.
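To make "tee'd transparently" concrete, here is a minimal sketch of the general technique using a plain generator. It assumes chunks are strings; `tee_stream`, `on_complete`, and the `final_text` field are hypothetical names, and real vendor streams carry richer chunk objects than this.

```python
from typing import Callable, Iterable, Iterator, List


def tee_stream(chunks: Iterable[str],
               on_complete: Callable[[dict], None]) -> Iterator[str]:
    """Yield every chunk unchanged while keeping a copy for the capture log."""
    captured: List[str] = []
    try:
        for chunk in chunks:
            captured.append(chunk)
            yield chunk  # the caller sees exactly what the SDK produced
    finally:
        # Runs when the stream is exhausted or the consumer closes it early.
        on_complete({
            "streamed": True,
            "chunks_captured": len(captured),
            "final_text": "".join(captured),  # hypothetical field name
        })


events = []
received = list(tee_stream(iter(["Hel", "lo", "!"]), events.append))
print(received)
print(events)
```

The calling code iterates the stream exactly as before; the capture happens as a side effect of that iteration.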
What an event looks like
```json
{
  "event_id": "...",
  "agent_name": "my_agent",
  "ts": 1715630400.123,
  "provider": "anthropic",
  "surface": "create",
  "duration_ms": 842.1,
  "request": { "model": "claude-...", "system": "...", "messages": [...], "tools": [...] },
  "response": {
    "content": [ {"type": "thinking", ...}, {"type": "tool_use", ...} ],
    "stop_reason": "tool_use",
    "usage": {
      "input_tokens": 512,
      "output_tokens": 128,
      "cache_read_input_tokens": 256,
      "cache_creation_input_tokens": 64
    }
  },
  "call_site": { "file": "/path/to/your/agent.py", "line": 84, "func": "step" },
  "status": "ok"
}
```
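Because each event is one JSON object per line, the logs can be processed with the standard library. The sketch below totals the usage fields shown in the event above; `sum_usage` is a hypothetical helper, and it assumes events that failed may have no `response` or `usage` at all.

```python
import json

USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
)


def sum_usage(lines):
    """Total the usage counters across an iterable of JSONL lines."""
    totals = {field: 0 for field in USAGE_FIELDS}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        event = json.loads(line)
        # Assumption: errored calls may lack a response or usage block.
        usage = (event.get("response") or {}).get("usage") or {}
        for field in USAGE_FIELDS:
            totals[field] += usage.get(field) or 0
    return totals


sample = [
    json.dumps({"status": "ok", "response": {"usage": {
        "input_tokens": 512, "output_tokens": 128,
        "cache_read_input_tokens": 256, "cache_creation_input_tokens": 64}}}),
    json.dumps({"status": "error"}),
]
totals = sum_usage(sample)
print(totals)
```

For a real log file, pass the open file object instead of `sample`, since iterating a text file yields one line at a time.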
Forensic depth, one line per call:
- request: model, system, full messages, full tools schema, max_tokens, thinking config, sampling params
- response: every content block (text, thinking, tool_use, tool_result), stop_reason, stop_sequence, message id, model returned
- usage: input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokens, so you can see what the model actually paid for, not what you hoped it would
- call_site: file path + line number + function name of the caller, with /tessen/ and vendor-SDK frames skipped so you land on your code
- timing: duration_ms wall-clock
- errors: exception type, message, and full traceback if the call raised
- streaming: streamed: true + chunks_captured + assembled final message
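The call_site field can be approximated by walking the Python call stack and skipping frames whose file path matches a filter. This is a sketch of the general technique, not Tessen's code; `find_call_site` and its `skip_substrings` parameter are hypothetical, and a real filter would also need to cover vendor-SDK paths.

```python
import inspect


def find_call_site(skip_substrings=("/tessen/",)):
    """Return file, line, and function of the nearest caller frame not filtered out."""
    frame = inspect.currentframe()
    frame = frame.f_back if frame is not None else None  # skip this helper itself
    while frame is not None:
        filename = frame.f_code.co_filename
        if not any(part in filename for part in skip_substrings):
            return {
                "file": filename,
                "line": frame.f_lineno,
                "func": frame.f_code.co_name,
            }
        frame = frame.f_back  # keep walking outward
    return None


def step():
    # Stands in for the agent function that makes the LLM call.
    return find_call_site()


site = step()
print(site)
```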
Where logs live
~/.tessen/logs/{agent_name}/{YYYY-MM-DD}.jsonl. Override with tessen.init(log_dir=...). Files rotate at 100 MB.
```shell
python -m tessen.viewer ~/.tessen/logs/my_agent/2026-05-13.jsonl
```
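If you want to locate a day's log from your own scripts, the path layout above can be reproduced with pathlib. `log_path` is a hypothetical helper; it assumes that a custom log_dir still contains one subdirectory per agent, which this README does not confirm, and it ignores any extra files produced by rotation.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def log_path(agent_name: str, day: date, log_dir: Optional[Path] = None) -> Path:
    """Build {log_dir}/{agent_name}/{YYYY-MM-DD}.jsonl, defaulting to ~/.tessen/logs."""
    root = Path(log_dir) if log_dir is not None else Path.home() / ".tessen" / "logs"
    # Assumption: the per-agent subdirectory is kept under a custom log_dir.
    return root / agent_name / f"{day.isoformat()}.jsonl"


print(log_path("my_agent", date(2026, 5, 13)))
```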
Overhead
The wrapper is a thin function call around the original SDK method, and each event is written with a thread-locked append followed by fsync(). Real-world overhead is well under 1 ms per call, so Tessen adds no meaningful latency to your hot path.
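A thread-locked append followed by fsync() can be sketched in a few lines. `JsonlWriter` is a hypothetical class written for illustration; it reopens the file on every write for simplicity, which may differ from what Tessen does, and it leaves out rotation.

```python
import json
import os
import tempfile
import threading


class JsonlWriter:
    """Append one JSON object per line, serialized by a lock and flushed to disk."""

    def __init__(self, path):
        self._path = path
        self._lock = threading.Lock()

    def write(self, event: dict) -> None:
        # Serialize outside the lock so the critical section stays short.
        line = json.dumps(event, default=str) + "\n"
        with self._lock:
            with open(self._path, "a", encoding="utf-8") as f:
                f.write(line)
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to persist to disk


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "events.jsonl")
    writer = JsonlWriter(path)
    threads = [threading.Thread(target=writer.write, args=({"n": i},))
               for i in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with open(path, encoding="utf-8") as f:
        written = [json.loads(line) for line in f]

print(len(written))
```

The lock keeps lines from interleaving when several threads log at once, and the fsync() call is what lets an event survive a crash right after the call returns.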
Frameworks
Tessen patches at the resource-class level (anthropic.resources.messages.Messages.create, etc.), so framework wrappers — LangChain ChatAnthropic, LangGraph nodes, deepagents, OpenAI Agents SDK with an Anthropic adapter — all flow through Tessen automatically. That's structural, not opt-in.
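The reason class-level patching reaches framework wrappers can be shown with stand-in classes. Everything below is hypothetical: `Messages` imitates a vendor resource class, `FrameworkChatModel` imitates a framework wrapper, and `patch_method` is a sketch of the technique rather than Tessen's implementation.

```python
import functools


class Messages:
    """Stand-in for a vendor resource class such as a messages resource."""

    def create(self, **kwargs):
        return {"echo": kwargs}


class FrameworkChatModel:
    """Stand-in for a framework wrapper that calls the vendor SDK internally."""

    def __init__(self):
        self._messages = Messages()

    def invoke(self, prompt):
        return self._messages.create(model="demo", prompt=prompt)


def patch_method(cls, name, sink):
    """Replace cls.<name> with a wrapper that records each call into sink."""
    original = getattr(cls, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        sink.append({"surface": name, "request": kwargs, "response": result})
        return result

    setattr(cls, name, wrapper)
    return original  # returned so the patch can be undone


captured = []
patch_method(Messages, "create", captured)

# The framework wrapper is unaware of the patch, yet its call is recorded.
reply = FrameworkChatModel().invoke("hi")
print(reply)
print(captured)
```

Since the wrapper lives on the class, any code path that ends in `Messages.create` is captured, no matter which framework object made the call.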
Privacy
The SDK runs in your process. There is no network call from Tessen. You bring your own API keys; we never see them. Your code, your data, your machine.
FAQ
Where is my data going?
Nowhere. The SDK writes locally to disk. There is no network call to Tessen from the SDK. You bring your own API keys; we never see them.
Does this slow down my agent?
Barely. The wrapper is a thin function call around the original SDK method plus a thread-locked append + fsync(). Real-world overhead measures well under 1 ms per call.
What if my framework wraps the SDK?
Tessen patches at the resource-class level, so framework wrappers (LangChain ChatAnthropic, LangGraph nodes, deepagents, OpenAI Agents SDK with an Anthropic adapter) all flow through automatically. You do not write framework-specific code.
What about token usage on streams?
Streams are tee'd. The captured event records the assembled final message, including usage, when the vendor SDK exposes a get_final_message() or get_final_response() accessor. Otherwise we record the chunks themselves so you can reconstruct the message and its usage yourself.
Where do logs live?
~/.tessen/logs/{agent_name}/{YYYY-MM-DD}.jsonl. Override with tessen.init(log_dir=...). Files rotate at 100 MB.
From capture to control
The SDK above is Act 1 — the harness in place, every call captured. The arc continues:
- Act 2 — the analyzer (coming). Captured JSONL plus your codebase, joined into answers, not dashboards: "your agent silently retried this tool call 4 times this morning, costing $312", "this tool fails 12% of the time and the agent never reports it", "your loop iteration count drifted from 3 on Monday to 47 on Friday." That is the finding an engineer pays for.
- Act 3 — the active harness (coming). PR diffs that fix detected agent fragility. Runtime guards that block runaway loops before the bill lands. Cost caps that fire before the page does.
From capture to control. Tessen is the harness, not the dashboard.
License
MIT. Vendor it, ship it, modify it.
File details
Details for the file tessen-0.2.0.tar.gz.
File metadata
- Download URL: tessen-0.2.0.tar.gz
- Upload date:
- Size: 38.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8352a4c9fc9781da95f34088f79d0c5a414b27d3079f02dabdb36b7f539b2e0d |
| MD5 | 33ba7c62833f0bc63f8bcee06327a9f0 |
| BLAKE2b-256 | bc025f4cd98316ee5e562790b5e4580c9c17760c30e8ad4bb5629e90929e006a |
File details
Details for the file tessen-0.2.0-py3-none-any.whl.
File metadata
- Download URL: tessen-0.2.0-py3-none-any.whl
- Upload date:
- Size: 35.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 01d4571521ef94ffdab15f23e055751b789b8dacc75d0e8cdcd6ef2bef866bcc |
| MD5 | b0a6b94f23096e0b60b1b41a232b95b2 |
| BLAKE2b-256 | ac32b2da5d2302004f5162ff9c15cb956e9ee80e86ba4b776935e038ea32d0c9 |