
Reliability infrastructure for AI agents: Local-first budget & loop enforcement.

Project description

kazenai-core

Stop your AI agents from burning your budget. Catch loops before they catch you.



What is KazenAI?

KazenAI is reliability infrastructure for AI agents.

When you run an AI agent in production, three things will eventually go wrong:

  1. It loops — and burns your entire monthly LLM budget in 40 minutes
  2. It fails silently — returns 200 OK but the business outcome is wrong
  3. You can't debug it — because AI agents are non-deterministic and single-trace debugging is meaningless

KazenAI intercepts every LLM and tool call your agent makes, enforces budget limits locally (no network required), detects loops before they become expensive, and gives you full observability — with one function call.

from kazenai import monitor

agent = monitor(
    agent,
    agent_id="support-agent",
    api_key="kz_...",
    max_budget_usd=5.00,
    debug=True,           # see cost per call in your terminal
)

That's it. Your agent code doesn't change.


The problem in one screenshot

[KazenAI] step=1  gpt-4o-mini  cost=$0.0043  total=$0.0043  proj=$0.21/$5.00  OK
[KazenAI] step=2  gpt-4o-mini  cost=$0.0041  total=$0.0084  proj=$0.19/$5.00  OK
[KazenAI] step=8  gpt-4o-mini  cost=$0.0039  total=$0.43    proj=$4.91/$5.00  WARN:budget_87pct
[KazenAI] step=9  gpt-4o-mini  BLOCKED:budget  spent=$5.00  limit=$5.00

KazenBudgetExceeded: Run agent-001: budget $5.00 exceeded (spent $4.97, attempted $5.42)

No more waking up to a $47K bill.


Features (MVP — shipping Week 1)

  • monitor(agent, ...) — wraps LangChain, AutoGen, and CrewAI agents with one call
  • Local budget enforcement — blocks calls before they're made, no network required, <1ms
  • Real-time debug output — debug=True prints cost + status after every LLM call
  • Loop detection (H1) — Jaccard similarity catches near-duplicate inputs before they spiral (a minimal sketch follows this list)
  • Loop detection (H2) — tool chain fingerprinting catches recursive tool patterns
  • Rate limiting — max_calls_per_minute prevents runaway agents from moving too fast
  • Event sampling — sample_rate=0.3 to control backend traffic at scale
  • Offline resilience — RetryQueue (SQLite) stores events locally if backend is down
  • Parent-child tracing — RunContext with parent_step_id for multi-agent pipelines
  • Canonical event schema — KazenEvent shared by SDK and backend (no field drift)
  • OpenAI patcher — transparent monkey-patch, no code changes required
  • LangChain integration — KazenCallbackHandler + LangChainProxy
  • AutoGen integration — wraps initiate_chat
  • CrewAI integration — wraps kickoff()
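
For a feel of how H1 works, here is a minimal, hypothetical sketch: compare each new prompt against a window of recent prompts using Jaccard similarity over word sets, and flag near-duplicates. The class name, window size, and 0.9 threshold are illustrative only; they are not KazenAI's internal API.

from collections import deque

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

class LoopDetectorSketch:
    """Illustrative only; not the real KazenAI loop detector."""

    def __init__(self, window: int = 5, threshold: float = 0.9):
        self.recent = deque(maxlen=window)   # last N prompts seen
        self.threshold = threshold

    def is_looping(self, prompt: str) -> bool:
        # Flag if the new prompt is nearly identical to any recent prompt.
        hit = any(jaccard(prompt, prev) >= self.threshold for prev in self.recent)
        self.recent.append(prompt)
        return hit

detector = LoopDetectorSketch()
for step in ["look up order 123", "look up order 123", "summarise findings"]:
    print(step, "->", "possible loop" if detector.is_looping(step) else "ok")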

Planned Modules (Roadmap)

  • AgentLens P1 — step-level trace capture dashboard (Month 2)
  • AgentLens P2 — Probabilistic Replay Engine: run any trace N times, get statistical distribution of outcomes (Month 5) — no competitor has built this
  • AgentLens P3 — Semantic Drift Monitor: detect when your agent's behaviour changes after a model update (Month 6)
  • TypeScript SDK — for Node.js agent frameworks (Month 6)

Installation

pip install kazenai

Python 3.10, 3.11, 3.12 supported. No C extensions. Installs in under 30 seconds.


Quick Start

Raw OpenAI (no framework)

import openai
from kazenai import monitor, KazenBudgetExceeded

client = openai.OpenAI()

# monitor() patches the OpenAI client transparently
monitor(
    client,
    agent_id="my-agent",
    api_key="kz_...",          # get yours at kazenai.com
    max_budget_usd=0.50,
    debug=True,
)

try:
    for i in range(100):       # simulated loop
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": f"Step {i}: do the thing"}],
        )
except KazenBudgetExceeded as e:
    print(f"Blocked at step {e.run_id}: spent ${e.spent:.4f}")

LangChain

from kazenai import monitor

chain = your_langchain_chain  # LCEL chain, agent, etc.
monitored = monitor(
    chain,
    agent_id="customer-support",
    api_key="kz_...",
    max_budget_usd=2.00,
    debug=True,
)

result = monitored.invoke({"input": "help me with my order"})

CrewAI

from kazenai import monitor

crew = YourCrew()
monitored = monitor(
    crew,
    agent_id="research-crew",
    api_key="kz_...",
    max_budget_usd=10.00,
    h2_max_reps=3,   # block if same tool chain repeats 3 times
)

result = monitored.kickoff(inputs={"topic": "AI trends"})

Why local enforcement matters

Most observability tools record what happened. KazenAI blocks what's about to happen.

Traditional tools:  LLM call → response → log cost → dashboard shows $47K
KazenAI:            Pre-flight check → BLOCKED → LLM call never made

Local enforcement means:

  • No network dependency — works with backend_url=None
  • No latency added — budget check completes in <1ms
  • A backend outage is not a protection failure — the agent doesn't need to reach our servers to be protected
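
To make the pre-flight idea concrete, here is a rough, hypothetical sketch of a local budget check: estimate the cost of the next call, and refuse to make it if the running total would cross the limit. The names and the per-token prices are placeholders, not KazenAI internals.

class BudgetExceededSketch(RuntimeError):
    pass

# Assumed per-1K-token prices (input, output) for the sketch only.
PRICES = {"gpt-4o-mini": (0.00015, 0.0006)}

def estimate_cost(model: str, prompt_tokens: int, max_output_tokens: int) -> float:
    inp, out = PRICES[model]
    return (prompt_tokens / 1000) * inp + (max_output_tokens / 1000) * out

class BudgetGuard:
    """Illustrative pre-flight check: pure local arithmetic, no network."""

    def __init__(self, max_budget_usd: float):
        self.max_budget_usd = max_budget_usd
        self.spent = 0.0

    def preflight(self, model: str, prompt_tokens: int, max_output_tokens: int) -> None:
        # Block the call before it is made if it would push us past the limit.
        projected = self.spent + estimate_cost(model, prompt_tokens, max_output_tokens)
        if projected > self.max_budget_usd:
            raise BudgetExceededSketch(
                f"budget ${self.max_budget_usd:.2f} exceeded "
                f"(spent ${self.spent:.2f}, projected ${projected:.2f})"
            )

    def record(self, actual_cost_usd: float) -> None:
        self.spent += actual_cost_usd

guard = BudgetGuard(max_budget_usd=5.00)
guard.preflight("gpt-4o-mini", prompt_tokens=800, max_output_tokens=512)  # cheap call: passes
guard.record(0.0043)  # bill the actual cost once the response comes back

Because the check is plain arithmetic on local state, it adds no network round trip, which is why the SDK can keep protecting the agent even when the backend is unreachable.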

Repository Structure

kazenai-core/
├── kazenai/
│   ├── __init__.py          # public API: monitor()
│   ├── schema.py            # canonical KazenEvent (shared SDK + backend)
│   ├── context.py           # RunContext with parent_step_id
│   ├── monitor.py           # monitor() entry point
│   ├── interceptor.py       # LLM/tool call interception
│   ├── enforcement.py       # local budget + rate limit enforcement
│   ├── loop_detector.py     # H1 (Jaccard) + H2 (chain fingerprint)
│   ├── cost_tracker.py      # 14-model pricing table
│   ├── client.py            # async API client + sampling
│   ├── retry_queue.py       # SQLite retry queue for offline resilience
│   ├── debug.py             # debug=True terminal output
│   ├── config.py            # pydantic-settings
│   └── integrations/
│       ├── langchain.py
│       ├── autogen.py
│       ├── crewai.py
│       └── generic.py
├── examples/
│   ├── basic_agent.py       # raw OpenAI example
│   ├── langchain_example.py
│   ├── loop_example.py      # trigger loop detection
│   └── benchmark_latency.py # verify <5ms overhead
├── tests/
├── pyproject.toml
└── README.md

Design principles

  1. Local-first enforcement — SDK must block without backend
  2. Zero-blocking — SDK overhead <5ms on hot path
  3. Fail-open always — internal errors never crash your agent
  4. Canonical schema — KazenEvent used by both SDK and backend (extra='forbid'); a short sketch of the idea follows this list
  5. DX over features — debug=True gives value in 2 minutes
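
To illustrate what principle 4 buys you, here is a small hypothetical example of a pydantic model declared with extra='forbid': a field that one side adds without the other fails validation loudly instead of drifting silently. The field names below are placeholders, not the real KazenEvent schema.

from pydantic import BaseModel, ConfigDict, ValidationError

class EventSketch(BaseModel):
    """Placeholder fields only; not the real KazenEvent."""

    model_config = ConfigDict(extra="forbid")

    run_id: str
    step: int
    model: str
    cost_usd: float

EventSketch(run_id="run-001", step=1, model="gpt-4o-mini", cost_usd=0.0043)  # accepted

try:
    EventSketch(run_id="run-001", step=2, model="gpt-4o-mini",
                cost_usd=0.0041, surprise_field="oops")
except ValidationError as err:
    print(err)  # pydantic reports that extra inputs are not permitted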

Star this repo ⭐

If you've ever woken up to an unexpected LLM bill, or spent hours debugging an agent that returned 200 OK but did nothing useful — star this repo. It tells us this matters to you, and it helps us ship faster.

We're building in public. Follow @kazenai for weekly progress updates.


Status

Week 1 — Building core SDK
Week 2 — Design partner onboarding
Week 3 — Hosted dashboard + paid tiers
Week 4 — Public launch

Early access: kazenai.com or email founder@kazenai.com


Contributing

We're pre-1.0 and moving fast. The best way to contribute right now is:

  1. ⭐ Star the repo
  2. Open an issue describing a pain point you've hit with AI agent costs or loops
  3. Try the examples and report what breaks

Full contribution guide coming with v1.0.


License

Apache 2.0 — use it for anything, attribution appreciated.

Project details


Download files

Download the file for your platform.

Source Distribution

kazenai-0.1.0.tar.gz (9.2 kB)


Built Distribution


kazenai-0.1.0-py3-none-any.whl (9.1 kB)


File details

Details for the file kazenai-0.1.0.tar.gz.

File metadata

  • Download URL: kazenai-0.1.0.tar.gz
  • Upload date:
  • Size: 9.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for kazenai-0.1.0.tar.gz:

  • SHA256: c40164fc8799f2d1cfbf0bc71ca1c853654f661fb7a153440e135825c35781fd
  • MD5: 2dc12e764f3f9350971f6c53c80458fd
  • BLAKE2b-256: 70c0099416bb0c626117a7fb9dc8314fd6c2d90a799cb1d6c8d22dd4e869699c


File details

Details for the file kazenai-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: kazenai-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 9.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for kazenai-0.1.0-py3-none-any.whl:

  • SHA256: 5ed88aa3fe0198ea92c8198e4bed4d3a2f8bbc4933f0071233d0504bcbf1aa20
  • MD5: a1219d9a652829903fd25ea4f5989c57
  • BLAKE2b-256: e910df9730adf61f095c179139c69a75d80fd0e54bee109b082fe257774fee0a

