Predictive token-waste detection for AI agents

token-sentinel

Predictive token-waste detection for AI agents — a Python SDK that runs in-process, watches LLM calls after each provider response, and fires a typed callback so you can log, alert, or hard-stop the next turn.

Observability tools show the bill after the fact. TokenSentinel names the waste pattern while the session is still running. Detection is post-call (the turn that just finished is already billed); intervention saves subsequent turns.

Docs: https://docs.tokensentinel.dev

Install

pip install "token-sentinel[anthropic]"   # or [openai], [gemini], …
# optional: local token estimates when usage is missing
pip install "token-sentinel[tiktoken]"    # quotes keep shells such as zsh from globbing the brackets

Quick start

from token_sentinel import Sentinel
import anthropic

sentinel = Sentinel(project="my-agent", mode="log")  # log | alert | block

@sentinel.on_waste  # or @sentinel.on_leak — same callback
def handle(event):
    print(f"{event.type} conf={event.confidence:.2f} burn≈${event.estimated_burn:.4f}")

client = sentinel.wrap(anthropic.Anthropic())
client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello"}],
)
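A handler can do more than print. The sketch below tallies estimated burn per event type and flips a flag once a session budget is exceeded. It reads only the three event attributes shown above (`type`, `confidence`, `estimated_burn`); `make_budget_handler`, `budget_usd`, `min_confidence`, and the `SimpleNamespace` stand-in events are hypothetical names used for illustration, not part of the SDK.

```python
from collections import defaultdict
from types import SimpleNamespace


def make_budget_handler(budget_usd, min_confidence=0.5):
    """Return (handler, state); the handler tallies estimated burn per event type."""
    state = {"burn_by_type": defaultdict(float), "over_budget": False}

    def handle(event):
        # Ignore low-confidence detections so noise does not eat the budget.
        if event.confidence < min_confidence:
            return
        state["burn_by_type"][event.type] += event.estimated_burn
        if sum(state["burn_by_type"].values()) > budget_usd:
            state["over_budget"] = True

    return handle, state


# Stand-in events for illustration only; real events come from the SDK.
handler, state = make_budget_handler(budget_usd=0.05)
for burn in (0.02, 0.02, 0.02):
    handler(SimpleNamespace(type="tool_loop", confidence=0.9, estimated_burn=burn))
# Below min_confidence, so this one is not counted.
handler(SimpleNamespace(type="zombie", confidence=0.2, estimated_burn=1.0))

print(state["over_budget"], dict(state["burn_by_type"]))
```

In a real session the returned `handler` would be the function registered with `@sentinel.on_waste`, and the agent loop would check `state["over_budget"]` before starting the next turn.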

Tune rule thresholds

Every rule exposes flat config keys of the form "rule_name.param". For example, to require five similar tool calls (the default is 3) before tool_loop fires:

sentinel = Sentinel(
    project="my-agent",
    mode="log",
    config={
        "tool_loop.min_calls": 5,
        "tool_loop.cosine_threshold": 0.80,
        "tool_loop.window_seconds": 120,
    },
)

Full parameter tables: docs/user/04-waste-rules.md.
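To make the flat "rule_name.param" convention concrete, here is a minimal sketch of how such keys could be split into per-rule parameter sets. `group_rule_config` is a hypothetical helper written for this example; how the SDK parses its config internally is not documented here.

```python
def group_rule_config(flat):
    """Split flat "rule_name.param" keys into {rule_name: {param: value}}."""
    grouped = {}
    for key, value in flat.items():
        # partition splits on the first dot only: "tool_loop.min_calls"
        # becomes ("tool_loop", ".", "min_calls").
        rule, sep, param = key.partition(".")
        if not sep or not rule or not param:
            raise ValueError(f"expected 'rule_name.param', got {key!r}")
        grouped.setdefault(rule, {})[param] = value
    return grouped


config = {
    "tool_loop.min_calls": 5,
    "tool_loop.cosine_threshold": 0.80,
    "tool_loop.window_seconds": 120,
}
grouped = group_rule_config(config)
print(grouped)
```

Flat keys keep the config a single-level dict, which is easy to load from environment variables or a YAML file without nested merging.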

What it catches (15 rules)

| Rule | Signal |
| --- | --- |
| tool_loop | Same tool, ≥N cosine-similar calls in a window |
| context_bloat | Prompt-tokens-per-turn slope rising |
| embedding_waste | Same embedding lookup repeated in session |
| zombie | Calls continue with no user-facing output |
| model_misroute | Classification-shaped prompt on a frontier model |
| retry_storm | Same call retried many times unchanged |
| tool_definition_bloat | Huge tool JSON / MCP tool lists |
| retrieval_thrash | Overlapping retrieval queries |
| vision_re_upload | Same image re-uploaded across turns |
| vision_high_detail_misroute | High-detail flag on low-detail work |
| vision_cost_concentration | Vision spend concentrated in few sessions |
| audio_multichannel_doubling | Multichannel STT billing trap |
| voice_switching_loop | Same text, many voice IDs |
| rerank_thrash | Duplicate Cohere rerank requests |
| repair_loop | Correction churn + similar regenerations |
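
As an illustration of the tool_loop signal (same tool, at least N cosine-similar calls in a window), the sketch below uses a plain bag-of-words cosine over the call arguments. This is an assumption made for the example: the SDK's actual vectorization is not described here, and `detect_tool_loop` and `cosine` are hypothetical names. Only `min_calls=3` is a documented default; the threshold and window values are borrowed from the tuning example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def detect_tool_loop(calls, min_calls=3, cosine_threshold=0.80, window_seconds=120):
    """calls: list of (timestamp_seconds, tool_name, argument_text), oldest first.

    Returns True when the latest call plus the earlier calls to the same tool,
    inside the window and cosine-similar to it, number at least min_calls.
    """
    if not calls:
        return False
    now, tool, text = calls[-1]
    latest = Counter(text.lower().split())
    similar = 1  # the latest call counts toward the total
    for ts, other_tool, other_text in calls[:-1]:
        if other_tool != tool or now - ts > window_seconds:
            continue
        if cosine(latest, Counter(other_text.lower().split())) >= cosine_threshold:
            similar += 1
    return similar >= min_calls


calls = [
    (0, "search", "weather in paris today"),
    (10, "search", "weather in paris today please"),
    (20, "search", "weather in paris today"),
]
print(detect_tool_loop(calls))               # three similar calls within 120 s
print(detect_tool_loop(calls, min_calls=5))  # stricter threshold, as in the tuning example
```

Raising `min_calls` or `cosine_threshold` trades recall for fewer false positives, which is the knob the tuning section exposes.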

Providers

Native wrappers (pip install token-sentinel[<extra>]):

| Provider | Extra | Streaming |
| --- | --- | --- |
| Anthropic | `anthropic` | yes |
| OpenAI (+ Whisper) | `openai` | yes |
| Google Gemini / Vertex | `gemini` | yes |
| AWS Bedrock | `bedrock` | yes |
| Voyage, Cohere, Replicate, Deepgram, ElevenLabs | matching extra | varies |

OpenAI-compatible hosts (same wrapper, set base_url): DeepSeek, Together, Fireworks, Groq, OpenRouter, Mistral, Perplexity, vLLM, Ollama, TGI, LM Studio, xAI Grok (https://api.x.ai/v1), etc.

import openai
client = sentinel.wrap(
    openai.OpenAI(api_key="…", base_url="https://api.deepseek.com")
)

Pass a stable _sentinel_session_id=... on each call (or use Sentinel.session(...)) so multi-turn rules see history.

Modes

| Mode | Behavior |
| --- | --- |
| `log` | Emit events to your handler. Default; safe for production from day one. |
| `alert` | Same local behavior as `log`; the mode stamp only matters to the optional cloud. |
| `block` | Raise `LeakDetected` / `WasteDetected` after the provider returns. |
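
The sketch below shows what a hard stop in `block` mode looks like from the agent loop's side: the turn that triggered detection is already billed, and the exception prevents every later turn. `WasteDetected` here is a local stand-in class so the example runs without the SDK, and `fake_wrapped_call` and `run_agent` are hypothetical. Whether the real exception carries the provider response is not stated above, so the sketch does not rely on it.

```python
class WasteDetected(Exception):
    """Local stand-in for the SDK exception named in the table above."""


def fake_wrapped_call(turn):
    # Pretend the provider answered and detection then fired on turn 3.
    # By the time the exception is raised, turn 3 has already been billed.
    if turn == 3:
        raise WasteDetected(f"tool_loop detected after turn {turn}")
    return f"response {turn}"


def run_agent(max_turns=10):
    """Run turns until done or until a waste exception hard-stops the loop."""
    completed = []
    for turn in range(1, max_turns + 1):
        try:
            completed.append(fake_wrapped_call(turn))
        except WasteDetected as exc:
            # Hard stop: the remaining turns are never sent.
            return completed, turn, str(exc)
    return completed, None, None


completed, stopped_at, reason = run_agent()
print(completed, stopped_at, reason)
```

The saving comes from the turns that never run, which is why detection after the call still pays off in long sessions.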

Cost estimates (`estimated_burn`)

Burn figures are approximate FinOps signals, not invoices:

- Model-aware rates for major Anthropic / OpenAI / Gemini / DeepSeek / Cohere / Mistral families (input vs output priced separately).
- Prompt-cache reads (OpenAI cached_tokens, Anthropic cache_read_input_tokens) discounted when present on the usage payload.
- Unknown models fall back to a flat average (~$9e-6 / token).
- Optional tiktoken fills missing token counts when usage was omitted.

Override rates with Sentinel(pricing_table={...}), or work with the pricing helpers directly: from token_sentinel import estimate_usd, ModelRate, default_pricing_table.
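The arithmetic behind such an estimate can be sketched as follows. Everything in this block is hypothetical: `Rate`, `RATES`, `estimate_burn`, and the per-token prices are made up for illustration and are not the SDK's `ModelRate` or `estimate_usd`. Only the flat fallback of about $9e-6 per token comes from the text above. The sketch assumes cached tokens are counted inside `input_tokens`, as in OpenAI-style usage payloads; providers that report cache reads separately would need the subtraction removed.

```python
from dataclasses import dataclass

FLAT_FALLBACK_USD_PER_TOKEN = 9e-6  # flat average quoted above for unknown models


@dataclass(frozen=True)
class Rate:
    """Hypothetical per-token USD rates; not the SDK's ModelRate."""
    input_usd: float
    output_usd: float
    cached_input_usd: float


# Made-up prices for illustration only.
RATES = {
    "example-model": Rate(input_usd=3e-6, output_usd=15e-6, cached_input_usd=0.3e-6),
}


def estimate_burn(model, input_tokens, output_tokens, cached_tokens=0):
    """Approximate USD cost of one call, discounting prompt-cache reads."""
    rate = RATES.get(model)
    if rate is None:
        # Unknown model: one flat rate for every token.
        return (input_tokens + output_tokens) * FLAT_FALLBACK_USD_PER_TOKEN
    # Assumption: cached_tokens is a subset of input_tokens.
    uncached = max(input_tokens - cached_tokens, 0)
    return (
        uncached * rate.input_usd
        + cached_tokens * rate.cached_input_usd
        + output_tokens * rate.output_usd
    )


known = estimate_burn("example-model", 1000, 200, cached_tokens=800)
unknown = estimate_burn("some-unlisted-model", 1000, 200)
print(f"{known:.5f} {unknown:.5f}")
```

With these made-up rates the cached call costs less than the flat fallback would suggest, which is why cache-aware usage matters for a burn figure that is meant to guide intervention.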

Frameworks

TokenSentinel instruments the LLM-client layer, so MCP hosts, RAG pipelines, and LangChain / LangGraph / CrewAI / AutoGen / Pydantic AI work whenever their traffic ends up in a wrapped client. Enrichers: pip install token-sentinel[langchain] or [otel].

Optional cloud

Nothing phones home unless you pass both cloud_endpoint= and api_key= on Sentinel(...). For the hosted dashboard, Composite Signals, and policy features, see tokensentinel.dev. This package is Apache-2.0 and fully usable offline.

Status

1.0.3 — 15 rules, model-aware burn estimates, cache-aware usage, optional tiktoken fallback, OSS-focused docs.

Public API (Sentinel, wrap, on_leak / on_waste, record_call, session, mark_long_running, events/exceptions, pricing helpers) follows semver — pin deliberately (token-sentinel>=1.0,<2).

More docs

User guide — install, modes, rules, providers, API reference
Architecture · Waste taxonomy
CHANGELOG

Contact

support@tokensentinel.dev · hello@tokensentinel.dev · tokensentinel.dev

License

Apache-2.0 — see LICENSE.

Release history

| Version | Released |
| --- | --- |
| 1.0.3 (this version) | Jul 16, 2026 |
| 1.0.2 | Jul 13, 2026 |
| 1.0.1 | Jul 12, 2026 |
| 1.0.0 | Jun 11, 2026 |

