The harness for your AI agents. One line to install. Captures every model call, maps your codebase, generates frame-aware PR fixes — without your source ever leaving your machine.
tessen
The harness for your AI agents. One line to install. Captures every model call, every tool use, every retry. Maps your codebase. Learns your team's commit style. Generates frame-aware PR fixes — without your source ever leaving your machine.
Tame the beast in your Agentic Workflows. Use Tessen.
import tessen
tessen.init()
That's the entire integration. Drop it before you construct your LLM client.
Why
It's 3am. You're paged. Your agent burned $2,000 in API spend in two hours, looping on the same broken tool 47 times because the SDK swallowed a 502 and the agent silently retried. Your dashboard says the request succeeded. Your traces show one span. Your logs are noise.
You cannot defend what you cannot see — and the tools that watch your agent treat each model call like a web request instead of like a program that thinks. Tessen treats it like a program. Every thinking block, every tool call, every cache decision, every retry — recorded structurally, in your process, on your disk. When the page comes in, you have the receipts.
Install
pip install tessen
# or, with the optional log viewer
pip install "tessen[viewer]"
Core install has zero hard dependencies. Tessen patches the vendor SDKs that are already importable in your process — Anthropic, OpenAI, Google Gemini, Cohere, Mistral — without forcing any of them on you.
Overhead
Well under 1 ms per captured call — measured. Tessen sits between your code and the vendor's network round-trip (which is hundreds of milliseconds to several seconds), so the overhead is invisible. Run tessen bench to verify on your machine. The regression test fails the build if mean per-event overhead crosses 1 ms.
Typical numbers (Apple Silicon, APFS): ~75 µs mean, ~105 µs p99. The dominant cost is the fsync() after each disk write; set TESSEN_DISABLE_FSYNC=1 to skip it if you want lower latency at the cost of crash durability.
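The durability trade-off above is easier to see in code. This is a minimal sketch of an append-then-fsync writer, not Tessen's actual implementation; `append_event` and its `durable` parameter are hypothetical names used only for illustration.

```python
import json
import os
import tempfile


def append_event(path: str, event: dict, durable: bool = True) -> None:
    """Append one event as a JSON line; optionally force it to disk."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")
        f.flush()  # push Python's buffer to the OS
        if durable:
            # Ask the OS to persist the data to disk. This is the slow
            # step, and the one a "disable fsync" switch would skip.
            os.fsync(f.fileno())


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.jsonl")
    append_event(demo_path, {"event_id": "e1"})                 # durable
    append_event(demo_path, {"event_id": "e2"}, durable=False)  # faster
    with open(demo_path, encoding="utf-8") as f:
        print(f.read())
```

Without the fsync, a write that the OS has accepted but not yet persisted can be lost on power failure, which is the durability cost the flag trades for latency.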
What runs locally, free, today
Capture (always on)
import tessen
tessen.init()
Every model call from your agent — full request, full response with thinking blocks decoded, cache-token usage, call-site, timing — written as JSONL to ~/.tessen/logs/<agent>/<date>.jsonl. Trunk-patched across all 5 vendors: structurally vendor-rename-resistant.
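Because the capture format is plain JSONL, the logs can also be read with nothing but the standard library. The sketch below assumes only the directory layout described above (one directory per agent containing .jsonl files); `iter_events` is a hypothetical helper, not part of tessen, and it makes no assumption about how the date in the file name is formatted beyond sorting files by name.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_events(agent: str, log_dir: str = "~/.tessen/logs") -> Iterator[dict]:
    """Yield events from every .jsonl file under <log_dir>/<agent>/."""
    agent_dir = Path(log_dir).expanduser() / agent
    for path in sorted(agent_dir.glob("*.jsonl")):  # sorted by file name
        with path.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                try:
                    yield json.loads(line)
                except json.JSONDecodeError:
                    # Skip a torn line left behind by an interrupted write.
                    continue
```

For anything beyond quick inspection, the tessen.events module described later in this README is the supported surface.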
See what just happened
tessen status # which agents are active, errors, anomalies — at a glance
tessen tail my_agent # `tail -f` for live events
tessen viewer my_agent # per-event forensic detail
Per-event inspector. 🚨 markers on anomalies (silent retries, truncated responses, refused requests, cache-misses, malformed tool calls). HTTP status codes, vendor error types, request/response content blocks. The forensic record.
Aggregate findings
tessen analyze my_agent
Rolls up your captured events into a leadership-readable report:
## findings
🚨 12 ERROR event(s)
• 8× RateLimitError (429)
• 4× NotFoundError (404)
⚠ 47 event(s) flagged with anomalies:
• 34× cache_control_set_but_no_cache_activity
• 12× truncated_response
• 1× thinking_only_no_output
## cost
total: $312.40 last 7 days
claude-sonnet-4-6: $280.10
claude-haiku-4-5: $32.30
## latency (ms): p50=842 p95=4012 p99=14290
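The latency line in a report like the one above can be reproduced from the duration_ms field of captured events. The sketch below uses the nearest-rank percentile method, which is an assumption; the README does not say which method tessen analyze uses, and `percentile` and `latency_summary` are hypothetical names.

```python
import math


def percentile(values: list, pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of
    the data at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(events: list) -> dict:
    """Compute p50/p95/p99 over events that carry a duration_ms field."""
    durations = [e["duration_ms"] for e in events if "duration_ms" in e]
    return {f"p{p}": percentile(durations, p) for p in (50, 95, 99)}


if __name__ == "__main__":
    sample = [{"duration_ms": float(ms)} for ms in range(1, 101)]
    print(latency_summary(sample))
```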
Map your codebase
tessen crawl /path/to/your/repo
Deterministic AST scan. Finds every call site to every supported vendor SDK, classifies the surrounding function via named structural rules (tool_use_loop_anthropic, framework_wrapped:langgraph, etc.), extracts per-function docstrings, comments, body hashes, and in-file callers. Stdlib-only. Sub-second on most repos.
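To give a feel for what a deterministic AST scan involves, here is a minimal stdlib-only sketch that finds calls ending in .messages.create(...) and reports the enclosing function. It is not the tessen crawl implementation: `find_call_sites` is a hypothetical helper, it matches on attribute names only, and it ignores module-level calls.

```python
import ast


def find_call_sites(source: str, suffix: tuple = ("messages", "create")) -> list:
    """Return (function name, line number) for calls whose attribute chain
    ends with `suffix`, e.g. client.messages.create(...)."""

    def attr_chain(node) -> tuple:
        # Collect attribute names from the outermost call target inward.
        parts = []
        while isinstance(node, ast.Attribute):
            parts.append(node.attr)
            node = node.value
        return tuple(reversed(parts))

    hits = []
    tree = ast.parse(source)
    for func in ast.walk(tree):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Note: a call inside a nested function is reported once for each
        # enclosing function; a fuller scanner would track scopes.
        for node in ast.walk(func):
            if isinstance(node, ast.Call):
                if attr_chain(node.func)[-len(suffix):] == suffix:
                    hits.append((func.name, node.lineno))
    return hits


if __name__ == "__main__":
    demo = (
        "def run(client):\n"
        "    return client.messages.create(model='m', messages=[])\n"
    )
    print(find_call_sites(demo))  # [('run', 2)]
```

Because the scan only parses source text and never imports or executes it, it gives the same result on every run for the same input.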
What activates with TESSEN_API_KEY
When you set TESSEN_API_KEY in your environment, tessen.init() auto-activates:
export TESSEN_API_KEY=tk_live_...
tessen.init() # same one line — but now ships to the hosted backend
What ships:
- Events: your captured runtime stream
- Manifest: your codebase map (the output of tessen.crawl)
- Git style fingerprint: your team's commit/test/style conventions, with author emails SHA-256 hashed; no commit messages or diffs leave your machine
What never ships:
- Your source code
- Your commit messages
- Your diffs
- Your author names or emails (only SHA-256 hashes of author emails are sent)
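To make the hashing of author emails concrete, the sketch below shows one way an email could be reduced to a SHA-256 digest before anything is sent. The normalization step (trim and lowercase) is an assumption, as is the absence of a salt; `hash_email` is a hypothetical name and Tessen's fingerprinting may differ.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return the hex SHA-256 digest of a normalized email address."""
    normalized = email.strip().lower()  # assumption: trim + lowercase
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    digest = hash_email("  Dev@Example.com ")
    print(len(digest))                            # 64 hex characters
    print(digest == hash_email("dev@example.com"))  # True
```

The digest lets a backend tell that two commits share an author without receiving the address itself.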
The hosted backend uses these signals to generate frame-aware PRs that fix detected agent fragility — matching your team's commit style, test discipline, library preferences, and file co-change patterns. PR specs come back via tessen.apply, which reads your local source and produces the actual diff:
tessen apply spec.json --open-pr
Your source never leaves your machine. The hosted backend works from the manifest + events + style fingerprint; the local applier produces the diff. That's the unfakeable privacy claim.
What gets captured per event
Every event carries this base shape — verified by regression test against the actual captured JSONL:
{
"event_id": "...",
"session_id": "...",
"agent_name": "...",
"ts": 1715688000.594, // unix timestamp (seconds, float)
"_tessen_tier": 1, // 1=trunk, 2=discovery, 3=http-floor
"provider": "anthropic",
"surface": "messages.create",
"call_type": "request",
"http_method": "POST",
"http_url": "https://api.anthropic.com/v1/messages",
"call_site": {"file": "agent.py", "line": 101, "func": "run"},
"duration_ms": 842.1,
"request_body": { /* full SDK-serialized JSON */ },
"streamed": false,
"status": "ok", // "ok" | "error"
"quality_signals": ["rate_limited"] // anomaly tags surfaced at write time
}
The response shape depends on what happened:
| event variant | response fields present |
|---|---|
| Successful non-streaming call | casted_response (pydantic dump with parsed content blocks), response_body (raw JSON), response_headers (request-id, rate-limit, x-should-retry) |
| Successful streaming call | response_body (reassembled from chunks), chunks_captured (count) |
| Error (HTTP or client-side) | error (object with type, msg, tb, status_code, body, response_headers) |
So a customer code path that always wants the structured response should fall back: event.get("casted_response") or event.get("response_body") or event.get("error", {}).get("body"). The tessen viewer and tessen analyze CLIs do this for you.
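The fallback chain above can be wrapped in a small helper if you read events yourself. `structured_response` is a hypothetical name; the field names are the ones listed in the table.

```python
from typing import Any, Optional


def structured_response(event: dict) -> Optional[Any]:
    """Return the most structured response payload the event carries."""
    return (
        event.get("casted_response")               # successful non-streaming call
        or event.get("response_body")              # streaming call, or raw JSON
        or event.get("error", {}).get("body")      # error event
    )


if __name__ == "__main__":
    print(structured_response({"response_body": {"id": "r1"}}))
    print(structured_response({"error": {"type": "RateLimitError", "body": {"msg": "slow down"}}}))
    print(structured_response({"status": "ok"}))  # None
```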
The trunk-patching architecture means:
- Anthropic SDK rename? Captured anyway.
- OpenAI changes their resource paths? Captured anyway.
- Vendor adds a new method we haven't patched? Captured anyway.
Only a base-client rename breaks Tier 1 — at which point Tier 3 (HTTP-layer floor) keeps capturing every model call regardless. Three-layer architecture, structurally hardened.
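The idea behind patching the trunk rather than the leaves can be shown with a toy SDK. In this sketch, none of the class names belong to a real vendor SDK and `patch_trunk` is hypothetical; it only illustrates why wrapping one shared request method captures every resource built on top of it, including ones added later.

```python
import functools
import time

captured = []  # stands in for the JSONL writer


class BaseClient:
    """Toy vendor base client: every resource funnels through request()."""

    def request(self, method: str, path: str, body: dict) -> dict:
        return {"path": path, "echo": body}


class Messages:
    """Toy leaf resource; it is never patched directly."""

    def __init__(self, client: BaseClient):
        self._client = client

    def create(self, **body) -> dict:
        return self._client.request("POST", "/v1/messages", body)


def patch_trunk(base_cls) -> None:
    """Wrap the shared request() once so all leaf resources are captured."""
    original = base_cls.request

    @functools.wraps(original)
    def wrapper(self, method, path, body):
        start = time.perf_counter()
        status = "error"
        try:
            response = original(self, method, path, body)
            status = "ok"
            return response
        finally:
            captured.append({
                "http_method": method,
                "http_url": path,
                "status": status,
                "duration_ms": (time.perf_counter() - start) * 1000,
            })

    base_cls.request = wrapper


patch_trunk(BaseClient)
Messages(BaseClient()).create(model="demo")
print(captured[0]["http_url"], captured[0]["status"])  # /v1/messages ok
```

Renaming Messages or adding a new leaf class changes nothing here, because the wrapper lives on the method they all share.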
Anomaly tags
Tessen tags events at write-time with quality_signals so the analyzer can lead with what's actionable. Filter on any tag via tessen.events.filter_events(events, quality_signal="...") or tessen analyze's findings section.
Content-level (Anthropic / OpenAI / Gemini)
- empty_content_on_end_turn: model finished with stop_reason: end_turn but returned no content. Common silent-failure mode.
- empty_content_on_stop: OpenAI / Mistral equivalent (finish_reason: stop with empty message.content).
- truncated_response: stop_reason was max_tokens / length. The customer's max_tokens limit was hit, so the agent saw a cut-off response.
- thinking_only_no_output: Anthropic extended-thinking event with thinking blocks but zero text/tool_use output.
- refusal: model returned a refusal block (Anthropic / OpenAI policy decline).
- invalid_tool_use_json: Anthropic tool_use.input was malformed JSON in the streaming reassembly.
- invalid_tool_call_json: OpenAI tool_calls[].function.arguments was malformed JSON.
Cache-control (Anthropic prompt caching)
cache_control_set_but_no_cache_activity: request set cache_control: ephemeral but the response showed zero cache_read_input_tokens / cache_creation_input_tokens. The prompt was likely below the 1024-token minimum, or the cache TTL expired.
HTTP-error categorization (every vendor)
rate_limited (429), overloaded (529), auth_error (401), permission_denied (403), not_found (404), conflict (409), unprocessable (422), invalid_request (400), connection_error (network failure pre-status), server_error (5xx other), client_error (4xx other).
19 tags total, evaluated in order: HTTP errors first (since they're most product-relevant), then content-level anomalies, then cache. Empty quality_signals array means: clean event.
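The HTTP-error categorization listed above amounts to an exact-match lookup followed by range fallbacks. The sketch below mirrors that table; `http_error_tag` is a hypothetical function, and treating a missing status code as connection_error is an assumption based on the "network failure pre-status" description.

```python
from typing import Optional

# Status codes that map to a dedicated tag.
_EXACT_TAGS = {
    400: "invalid_request",
    401: "auth_error",
    403: "permission_denied",
    404: "not_found",
    409: "conflict",
    422: "unprocessable",
    429: "rate_limited",
    529: "overloaded",
}


def http_error_tag(status_code: Optional[int]) -> Optional[str]:
    """Map the status code of an error event to an anomaly tag."""
    if status_code is None:
        return "connection_error"  # assumption: failed before any status arrived
    if status_code in _EXACT_TAGS:
        return _EXACT_TAGS[status_code]
    if 500 <= status_code <= 599:
        return "server_error"      # 5xx other
    if 400 <= status_code <= 499:
        return "client_error"      # 4xx other
    return None                    # not an HTTP error


if __name__ == "__main__":
    for code in (429, 529, 502, 418, None, 200):
        print(code, http_error_tag(code))
```

The exact-match lookup runs before the range checks, so 529 is tagged overloaded instead of falling into the generic server_error bucket.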
Configuration
Tessen reads everything from env vars. The Python API stays one line.
| variable | default | purpose |
|---|---|---|
| TESSEN_API_KEY | unset → local-only | Activate hosted streaming when set |
| TESSEN_AGENT_NAME | inferred from module | Override agent identity |
| TESSEN_LOG_DIR | ~/.tessen/logs | Where local JSONL lands |
| TESSEN_INGEST_URL | https://api.tessen.dev/v1/ingest | Override for self-hosted |
| TESSEN_DISABLE_CRAWL | unset | =1 opts out of auto-crawl |
| TESSEN_DISABLE_GIT_LEARN | unset | =1 opts out of auto git fingerprint |
| TESSEN_LEAF_FALLBACK | unset | =1 reverts to legacy leaf-patching (compat escape hatch) |
| TESSEN_DISABLE_FSYNC | unset → fsync after every write | =1 skips fsync() per event: lower latency, but loses up to one buffered batch on power loss |
Shipper tuning (only matters when TESSEN_API_KEY is set):
| variable | default | purpose |
|---|---|---|
| TESSEN_FLUSH_INTERVAL_SEC | 5.0 | Max seconds to wait before flushing a partial batch |
| TESSEN_BATCH_MAX_EVENTS | 64 | Max events per POST |
| TESSEN_QUEUE_MAX | 1024 | Bounded queue size; drop-oldest on overflow |
| TESSEN_RETRY_MAX | 3 | Retries per batch with exponential backoff |
| TESSEN_HTTP_TIMEOUT_SEC | 10.0 | Per-request HTTP timeout |
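The "bounded queue, drop-oldest on overflow" behavior in the table above can be sketched with collections.deque. This is an illustration of the policy only, not the shipper's real code: `DropOldestQueue` is a hypothetical class, it is not thread-safe, and its defaults simply mirror the table.

```python
from collections import deque


class DropOldestQueue:
    """Bounded FIFO that evicts the oldest event when full."""

    def __init__(self, max_size: int = 1024):
        self._items = deque(maxlen=max_size)
        self.dropped = 0

    def put(self, event: dict) -> None:
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # the append below evicts the oldest item
        self._items.append(event)

    def take_batch(self, max_events: int = 64) -> list:
        """Remove and return up to max_events of the oldest queued events."""
        batch = []
        while self._items and len(batch) < max_events:
            batch.append(self._items.popleft())
        return batch


if __name__ == "__main__":
    q = DropOldestQueue(max_size=3)
    for n in range(5):
        q.put({"n": n})
    print(q.dropped)        # 2
    print(q.take_batch(2))  # [{'n': 2}, {'n': 3}]
```

Dropping the oldest events keeps memory bounded and favors recent activity when the backend is unreachable, at the cost of losing the earliest events in a long outage.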
Robustness: every TESSEN_* env var is whitespace-tolerant. Boolean flags accept 1 / true / yes / on case-insensitively; numeric tuning vars fall back to defaults for unparseable input. A typo or accidental $UNSET_VAR expansion never crashes import tessen.
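The parsing rules above can be sketched as two small helpers. `env_flag` and `env_float` are hypothetical names, not tessen's internals; the optional `environ` argument exists only so the behavior can be demonstrated without touching the real environment.

```python
import os

_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, environ=None) -> bool:
    """Whitespace-tolerant, case-insensitive boolean flag."""
    environ = os.environ if environ is None else environ
    return environ.get(name, "").strip().lower() in _TRUTHY


def env_float(name: str, default: float, environ=None) -> float:
    """Numeric setting that falls back to its default on unparseable input."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name, "").strip()
    try:
        return float(raw)
    except ValueError:
        return default


if __name__ == "__main__":
    fake_env = {
        "TESSEN_DISABLE_FSYNC": " Yes ",
        "TESSEN_FLUSH_INTERVAL_SEC": "five",
    }
    print(env_flag("TESSEN_DISABLE_FSYNC", fake_env))              # True
    print(env_float("TESSEN_FLUSH_INTERVAL_SEC", 5.0, fake_env))  # 5.0
```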
Power-user overrides exist as kwargs on tessen.init(). The 99% case is one function call, no args.
Vendor coverage
| vendor | sync | async | streaming (ctx-mgr) | streaming (iterator) | tool-use |
|---|---|---|---|---|---|
| Anthropic | ✅ | ✅ | ✅ | ✅ | ✅ |
| OpenAI (chat.completions) | ✅ | ✅ | ✅ | ✅ | ✅ |
| OpenAI (responses API) | ✅ | ✅ | ✅ | ✅ | ✅ |
| Google Gemini (google.genai) | ✅ | ✅ | ✅ | ✅ | ✅ |
| Cohere | ✅ | ✅ | ✅ | ✅ | ✅ |
| Mistral | ✅ | ✅ | ✅ | ✅ | ✅ |
Framework wrappers (LangChain ChatAnthropic, LangGraph nodes, deepagents, OpenAI Agents SDK) all flow through trunk-patching automatically — no per-framework code in tessen.
CLIs at a glance
After pip install tessen, the tessen command is on your PATH. Every subcommand also works as python -m tessen.<name> if you prefer.
tessen status # glanceable overview of every captured agent
tessen tail <agent> # `tail -f` for live events (`-n 5` for backfill)
tessen viewer <agent> # per-event forensic detail with anomaly markers
tessen analyze <agent> # aggregate findings — errors, anomalies, cost, latency
tessen compare a b # side-by-side diff of two events or sessions
tessen upload <agent> # manually ship local JSONL to the hosted backend
tessen crawl <repo> # deterministic AST codebase map
tessen repo-learn <repo> # privacy-preserving git style fingerprint
tessen apply <spec.json> # apply a PR spec from the hosted backend
tessen doctor # diagnose your integration (env vars, vendor SDKs)
tessen doctor --ping # also probe TESSEN_INGEST_URL reachability
tessen bench # measure per-event capture overhead
tessen version # print version
tessen status, tessen tail, tessen viewer, and tessen analyze all accept a bare agent name and resolve it under $TESSEN_LOG_DIR (default ~/.tessen/logs).
Programmatic Python API
For when the CLI isn't the right surface — building your own dashboard, integrating into a notebook, or running tessen's analysis as a step in your own pipeline:
import tessen.events as ev
# Iterate every captured event for an agent
for event in ev.read("my_agent"):
print(event["event_id"], event["status"], event.get("quality_signals"))
# Get the last N events (handy in notebooks)
last_50 = ev.recent("my_agent", n=50)
# Filter — composable, all kwargs optional
errors_only = ev.filter_events(last_50, status="error")
rate_limited = ev.filter_events(last_50, quality_signal="rate_limited")
in_session = ev.filter_events(last_50, session_id="sess_abc")
# Aggregate — same rollup shape as `tessen analyze --json`. Carries the
# `schema_version` / `tessen_version` / `generated_at` envelope so
# scripts can branch on version changes and cache by timestamp.
report = ev.aggregate(last_50)
print(report["total_cost_usd"], report["error_events"], report["anomaly_counts"])
print(report["schema_version"]) # 1 for tessen 0.3.x
# Time-bounded reads — accepts unix float, tz-aware datetime, timedelta,
# or a duration string like "24h"/"7d"/"2w" (same shorthand as `tessen
# tail --since`).
from datetime import timedelta
recent_24h = list(ev.read("my_agent", since=timedelta(hours=24)))
recent_24h = list(ev.read("my_agent", since="24h")) # equivalent
# Diagnostic: opt-in stats dict captures corrupt-line counts so you know
# if a writer crash left torn JSONL.
stats: dict = {}
events = list(ev.read("my_agent", stats=stats))
if stats.get("corrupt_lines"):
print(f"⚠ {stats['corrupt_lines']} torn lines skipped")
# Discover agents (defaults to only ones with captures; include_empty=True
# for raw filesystem truth)
agents = ev.list_agents()
All these functions also accept log_dir= to override the default location.
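For readers wondering how a shorthand like "24h", "7d", or "2w" maps to a time window, here is a minimal sketch of such a parser. It covers only the three units shown in the example above; `parse_duration` is a hypothetical helper, and tessen's own parser may accept more forms.

```python
import re
from datetime import timedelta

# Only the units that appear in this README's examples.
_UNITS = {"h": "hours", "d": "days", "w": "weeks"}


def parse_duration(text: str) -> timedelta:
    """Convert a string such as '24h', '7d', or '2w' to a timedelta."""
    match = re.fullmatch(r"(\d+)([hdw])", text.strip().lower())
    if not match:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


if __name__ == "__main__":
    print(parse_duration("24h") == timedelta(hours=24))  # True
    print(parse_duration("2w") == timedelta(days=14))    # True
```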
License
MIT. The SDK is open-source. The hosted backend (analyzer + PR generation) is a separate commercial service — see tessen.dev.
Contact
hi@tessen.dev · tessen.dev