Skip to main content

Behavioral drift monitoring for AI agents.

Project description

Driftbase

Behavioral drift detection for AI agents.

Fingerprint your agent's behavior in production. Diff two versions. Get a statistically grounded drift score and plain-English verdict — before your users notice something changed.

PyPI version License: Apache 2.0 Python 3.9+


What it does

Most observability tools tell you your agent ran. They don't tell you it behaved differently than last week.

Driftbase records behavioral signals from each run — tool call sequences, latency percentiles, decision outcomes, error patterns — and computes a drift score when you compare two versions. Every deploy gets a verdict: ship, monitor, review, or block.

  • One decorator, zero config — wraps your existing agent entry point
  • No account, no API key, no signup — works out of the box
  • Data never leaves your machine — local SQLite, always
  • Statistically rigorous — bootstrap confidence intervals on every drift score
  • Plain-English verdicts — not just a number, tells you what changed and what to do

Quickstart

1. Install

pip install driftbase

2. Wrap your agent

from driftbase import track

@track(version="v1.0")
def run_agent(input, config=None):
    return agent.invoke(input, config=config)

3. Run your agent normally

Generate at least 50 runs on each version you want to compare.

# Check how many runs you have
driftbase runs list

4. Deploy a new version

@track(version="v2.0")
def run_agent(input, config=None):
    return agent.invoke(input, config=config)

5. Diff

driftbase diff v1.0 v2.0

6. Read your verdict

────────────────────────────────────────────────
  DRIFTBASE  v1.0 → v2.0  ·  127 vs 89 runs
────────────────────────────────────────────────

  drift_score      0.29  [0.24 – 0.34, 95% CI]

  decision_drift   0.47  ⚠ HIGH
  └─ escalation rate: 4% → 11%
  └─ agent is routing 2.1× more to humans

  latency_drift    0.17  · MODERATE
  └─ p95: 243ms → 423ms

  tool_dist        0.12  · LOW
  error_drift      0.04  ✓ STABLE

────────────────────────────────────────────────
  VERDICT  ⚠  REVIEW BEFORE SHIPPING
────────────────────────────────────────────────
  Most likely cause:
  → Decision layer changed — escalation rate
    jumped from 4% to 11%.

  Next steps:
  □ Review system prompt changes between versions
  □ Check outcome routing and decision tree logic

────────────────────────────────────────────────
  ✓ Computed in 120ms · No data left your machine

How it works

Driftbase captures four behavioral signals per run:

Signal What it measures
Tool distribution Which tools were called and how often
Latency profile Response time percentiles (p50, p95, p99)
Decision outcomes Escalation rate, resolution rate, fallback rate
Error patterns Error rate and error type distribution

After 50 runs, a behavioral fingerprint is computed. When you diff two versions, Driftbase computes Jensen-Shannon divergence between fingerprints across all four dimensions, runs 500 bootstrap iterations for confidence intervals, and produces a verdict.

Nothing else is captured. Raw inputs, outputs, prompts, and user content are never stored or transmitted.


Privacy

What Status
Tool call names and sequences ✓ Captured locally
Latency measurements ✓ Captured locally
Decision outcome labels ✓ Captured locally
Error type identifiers ✓ Captured locally
SHA-256 input hash (deduplication only) ✓ Captured locally
Raw user inputs or prompts ✗ Never captured
Model outputs or completions ✗ Never captured
Conversation content ✗ Never captured
Any data transmitted to Driftbase ✗ Free tier: never

All data is stored in ~/.driftbase/runs.db. Nothing leaves your machine in the free tier.


Frameworks supported

Framework Integration
LangChain @track() decorator on executor invocation
LangGraph @track() decorator on compiled graph invoke
AutoGen @track() decorator on conversation initiator
CrewAI @track() decorator on crew kickoff
Raw OpenAI / Anthropic @track() decorator on API call function

See Framework integrations for copy-paste examples.


CLI reference

Command Description Example
diff Compare two versions driftbase diff v1.0 v2.0
runs list List versions and run counts driftbase runs list
runs inspect Show detailed stats for a version driftbase runs inspect v1.0
runs delete Delete runs for a version driftbase runs delete --version v1.0
report Generate drift report driftbase report v1.0 v2.0 --format html
export Export raw run data driftbase export --version v1.0 --format json
config show Show current config and DB path driftbase config show

Decorator reference

@track(
    version="v1.0",          # Required. Version identifier for this run.
    agent_id="my-agent",     # Optional. Defaults to function name.
    tags={"env": "prod"},    # Optional. Arbitrary metadata per run.
    outcomes=["resolved",    # Optional. Enables decision rate monitoring.
              "escalated"]
)
def run_agent(input, config=None):
    return agent.invoke(input, config=config)

Async functions are supported:

@track(version="v1.0")
async def run_agent(input, config=None):
    return await agent.ainvoke(input, config=config)

Interpreting drift scores

Score Verdict Action
0.00 – 0.10 ✓ STABLE Ship
0.10 – 0.20 · MONITOR Ship with monitoring
0.20 – 0.40 ⚠ REVIEW Investigate before shipping
0.40+ ✗ BLOCK Do not ship without review

See What to do about drift for dimension-specific playbooks.


EU compliance

Driftbase is designed for European teams with compliance obligations:

  • GDPR by architecture — no personal data captured, no DPA required for free tier
  • EU AI Act Articles 9 & 12 — behavioral audit trail generated automatically
  • DORA — timestamped change documentation per deploy
  • No US servers — free tier runs entirely on your machine

See driftbase.io/docs/gdpr-compliance for details.


Installation options

# Core (recommended)
pip install driftbase

# With semantic root-cause analysis (local embeddings, no API calls)
pip install driftbase[semantic]

Requires Python 3.9+. Works on macOS, Linux, and Windows (WSL recommended).


Links


License

Apache License 2.0. See LICENSE for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

driftbase-0.1.5.tar.gz (61.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

driftbase-0.1.5-py3-none-any.whl (66.7 kB view details)

Uploaded Python 3

File details

Details for the file driftbase-0.1.5.tar.gz.

File metadata

  • Download URL: driftbase-0.1.5.tar.gz
  • Upload date:
  • Size: 61.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for driftbase-0.1.5.tar.gz
Algorithm Hash digest
SHA256 46d4166a750700c1ef7d9346dc326cb099db7353594604754c63f0b2fca87c75
MD5 6bd518e816b5cf7ae657e70e2c0a6166
BLAKE2b-256 e24d03adbcf5d984d7af18c3bd873727edb2213a784105a23a5e9734c181ea43

See more details on using hashes here.

Provenance

The following attestation bundles were made for driftbase-0.1.5.tar.gz:

Publisher: release.yml on driftbase-labs/driftbase-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file driftbase-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: driftbase-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 66.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for driftbase-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 6da9cf72bfb3dfc504475d633e0352e461b11207a1a0939366d7de7a57a645c8
MD5 f05789add6aeac20293d30beba084645
BLAKE2b-256 1c2db8a85982a247cf24831b659a87a09a1f918018887d6cd921469dfd2940f4

See more details on using hashes here.

Provenance

The following attestation bundles were made for driftbase-0.1.5-py3-none-any.whl:

Publisher: release.yml on driftbase-labs/driftbase-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page