Behavioral drift detection for AI agents. Prevent regressions with automated drift checks in CI/CD.
Driftbase
Behavioral drift detection for AI agents using your Langfuse traces.
AI agents drift. A prompt update, a model swap, a RAG reindex — any of these can shift how your agent makes decisions, without triggering a single test failure.
Driftbase tells you when your agent changed, what caused it, and whether it got better or worse — by analyzing the traces you're already collecting in Langfuse.
pip install driftbase
Connect your Langfuse instance:
export LANGFUSE_PUBLIC_KEY=pk-lf-...
export LANGFUSE_SECRET_KEY=sk-lf-...
driftbase connect
Then, when something feels off:
driftbase diagnose
DRIFTBASE DIAGNOSTIC
Behavioral shift detected 11 days ago (2026-03-20)
Most likely cause: prompt change in release v2.1
Affected: escalation rate 4% → 19%, latency +1.2s
Recommendation: REVIEW before production deploy
No agent code changes. No instrumentation beyond the Langfuse tracing you already have. Just instant answers from your existing Langfuse data.
How it works
Driftbase is a drift detection layer on top of Langfuse. You already trace your agent with Langfuse — Driftbase reads those traces and detects behavioral drift.
1. You're already tracing with Langfuse
Your AI agent is instrumented with Langfuse (via LangChain, LangGraph, OpenAI, or any other framework). Traces flow into Langfuse automatically.
2. Connect Driftbase to Langfuse
Driftbase pulls historical traces from Langfuse and stores them locally for analysis:
driftbase connect
This imports your traces into a local SQLite database (~/.driftbase/runs.db). All analysis runs on your machine. No data leaves your environment.
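Because the store is a plain SQLite file, it can be inspected with standard tools. The sketch below lists whatever tables exist and their row counts. It makes no assumption about Driftbase's schema; the helper name `list_tables` is hypothetical, and only the default path comes from the text above.

```python
import sqlite3
from pathlib import Path

# Default location stated above; DRIFTBASE_DB_PATH may point elsewhere.
DEFAULT_DB = Path.home() / ".driftbase" / "runs.db"


def list_tables(db_path):
    """Return (table_name, row_count) pairs for every table in the SQLite file."""
    # mode=ro opens the file read-only, so inspection cannot modify the data.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        return [
            (name, conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0])
            for name in names
        ]
    finally:
        conn.close()


if __name__ == "__main__":
    if DEFAULT_DB.exists():
        for name, count in list_tables(DEFAULT_DB):
            print(f"{name}: {count} rows")
    else:
        print(f"No database at {DEFAULT_DB}; run `driftbase connect` first.")
```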
3. Detect drift
When something feels wrong:
driftbase diagnose
Scans your full trace history, detects behavioral shifts, and correlates them with version changes.
Compare explicit versions:
driftbase diff v1.0 v2.0
Produces a statistical drift score and a deployment verdict (SHIP / MONITOR / REVIEW / BLOCK).
View behavioral history:
driftbase history
Shows how your agent's behavior evolved over time — which epochs were stable, which shifted, and what changed at each breakpoint.
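To make the idea of epochs and breakpoints concrete, here is a minimal sketch of a mean-shift search over a single metric. It is an illustration only, not Driftbase's actual epoch detection; the function name `find_breakpoint`, the `min_segment` parameter, and the sample data are all hypothetical.

```python
from statistics import mean


def find_breakpoint(series, min_segment=3):
    """Return (index, shift) for the split that maximizes the change in mean.

    Everything before `index` is one epoch, everything from `index` on is the next.
    Returns (None, 0.0) when the series is too short or completely flat.
    """
    best_index, best_shift = None, 0.0
    for i in range(min_segment, len(series) - min_segment + 1):
        shift = mean(series[i:]) - mean(series[:i])
        if abs(shift) > abs(best_shift):
            best_index, best_shift = i, shift
    return best_index, best_shift


# Hypothetical daily escalation rates: stable near 4%, then near 19%.
daily_escalation = [0.04, 0.05, 0.03, 0.04, 0.04, 0.19, 0.18, 0.20, 0.19, 0.19]
index, shift = find_breakpoint(daily_escalation)
print(f"breakpoint at day {index}, mean shift {shift:+.2f}")
```

A production detector would also test whether the shift is statistically significant and handle multiple breakpoints; this sketch finds only the single strongest split.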
Core Value Proposition
| What You Get | Why It Matters |
|---|---|
| 60-second wow moment | Run driftbase demo --offline to see drift detection on synthetic data with zero dependencies |
| Zero cold start | Start detecting drift from day 1 using your existing Langfuse traces — no SDK to add, no baseline to collect |
| GitHub Action integration | Automatic drift checks on every PR with rich, color-coded reports posted as comments |
| Self-calibrating drift scores | Weights and thresholds learn from your labeled deployments — the more you use it, the better it gets |
| Root cause pinpointing | Correlates drift with version changes and surfaces the most likely cause with confidence level |
| 100% local-first | All data stays on your machine in SQLite — no cloud required, GDPR-compliant by design |
| Framework-agnostic | Works with any framework already traced in Langfuse or LangSmith — LangChain, OpenAI, CrewAI, custom agents |
| Progressive confidence | Starts working with just 15 runs, full statistical power at 50+ runs per version |
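The run-count thresholds in the last row can be read as a simple gate. The sketch below encodes only the two cut-offs from the table (15 and 50 runs per version); the label names and the rule of taking the smaller of the two versions are assumptions for illustration.

```python
def confidence_level(runs_per_version):
    """Map a per-version run count to a coarse confidence label.

    The labels are hypothetical; only the 15 and 50 cut-offs come from the table.
    """
    if runs_per_version < 15:
        return "insufficient"
    if runs_per_version < 50:
        return "partial"
    return "full"


def confidence_for_diff(baseline_runs, current_runs):
    """A comparison is only as strong as its smaller sample (assumed rule)."""
    return confidence_level(min(baseline_runs, current_runs))


print(confidence_for_diff(baseline_runs=120, current_runs=22))
```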
60-Second Demo (No Dependencies)
Want to see drift detection in action before connecting your own traces?
pip install driftbase
driftbase demo --offline
This generates synthetic agent runs showing realistic behavioral drift scenarios and walks you through the core commands. 100% offline, zero external dependencies.
The 5-Minute Quickstart
1. Install
pip install driftbase
2. Set Langfuse credentials
export LANGFUSE_PUBLIC_KEY=pk-lf-...
export LANGFUSE_SECRET_KEY=sk-lf-...
export LANGFUSE_HOST=https://cloud.langfuse.com # optional
Get your keys from Langfuse Settings → API Keys.
3. Import traces
# Auto-detect and import
driftbase connect
# Or specify project explicitly
driftbase connect langfuse --project my-agent --limit 1000
4. Detect drift
# Automatic drift detection
driftbase diagnose
# Compare specific versions
driftbase diff v1.0 v2.0
# View behavioral history
driftbase history
That's it. You're detecting drift in 5 minutes using traces you already have.
See examples/langfuse-quickstart for a complete walkthrough.
What Driftbase analyzes
Driftbase computes drift across 12 behavioral dimensions:
- Decision drift — Changes in outcome distribution (resolved/escalated/error)
- Tool sequence — Pattern changes in tool usage order
- Tool distribution — Frequency changes in which tools are called
- Latency — p95 latency shifts
- Error rate — Proportion of failed runs
- Retry rate — How often the agent retries operations
- Loop depth — Changes in iterative reasoning patterns
- Verbosity ratio — Output length relative to input
- Output length — Total token count in responses
- Time to first tool — How quickly the agent starts using tools
- Semantic drift — Heuristic clustering of output semantics
- Tool transitions — Changes in tool-to-tool call patterns
Each dimension is weighted based on your agent's inferred use case (e.g., customer support vs. code generation).
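As a rough illustration of two of these dimensions, the sketch below measures decision drift as the total variation distance between outcome distributions and computes a p95 latency. These are stand-in metrics chosen for clarity; the source does not state which statistics Driftbase actually uses, and all function names and sample data here are hypothetical.

```python
from collections import Counter
from statistics import quantiles


def outcome_distribution(outcomes):
    """Turn a list of outcome labels into label -> proportion."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def total_variation(p, q):
    """Total variation distance: 0.0 for identical distributions, 1.0 for disjoint ones."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in labels)


def p95(values):
    """95th percentile; quantiles(n=100) returns 99 cut points, index 94 is the 95th."""
    return quantiles(values, n=100, method="inclusive")[94]


# Hypothetical runs: escalation rate moves from 4% to 19%.
baseline = ["resolved"] * 96 + ["escalated"] * 4
current = ["resolved"] * 81 + ["escalated"] * 19
drift = total_variation(outcome_distribution(baseline), outcome_distribution(current))
print(f"decision drift (TVD): {drift:.2f}")
```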
CLI Commands
Core Commands
# Connect to Langfuse and import traces
driftbase connect
# Detect drift automatically across all versions
driftbase diagnose
# Compare two specific versions
driftbase diff v1.0 v2.0
# View behavioral history over time
driftbase history
# Interactive setup guide
driftbase init
Advanced Commands
# Inspect individual runs
driftbase inspect <run-id>
# Export drift report as JSON
driftbase export --format json --output report.json
# Set up behavioral budgets
driftbase budgets set --dimension error_rate --threshold 0.05
# Prune old runs to save space
driftbase prune --before 2026-01-01
# Health check
driftbase doctor
CI/CD Integration
Driftbase integrates seamlessly into deployment pipelines to catch behavioral regressions before production.
Output Formats
# Rich terminal output (default)
driftbase diff v1.0 v2.0
# JSON for programmatic consumption
driftbase diff v1.0 v2.0 --format=json
# Markdown for PR comments
driftbase diff v1.0 v2.0 --format=markdown
Exit Codes
- Exit 0: SHIP or MONITOR verdicts (safe to deploy)
- Exit 1: REVIEW or BLOCK verdicts (manual review required)
Quick Start: GitHub Actions
# .github/workflows/drift-check.yml
name: Drift Check
on: [pull_request]
jobs:
  drift:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install driftbase
      - run: driftbase diff v1.2.3 v1.3.0 --ci
        env:
          DRIFTBASE_DB_PATH: ./runs.db
The --ci flag enables:
- JSON output
- Non-zero exit on drift
- Compact formatting
Detailed Verdict Analysis
After a diff completes, use driftbase explain to see the full breakdown:
# Explain most recent verdict
driftbase explain
# Explain specific verdict by ID
driftbase explain abc-123-def
Shows:
- Top 3 contributing dimensions with evidence
- Confidence intervals and significance markers
- Minimum Detectable Effects (MDEs)
- Rollback target (for REVIEW/BLOCK verdicts)
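For readers unfamiliar with Minimum Detectable Effects, the sketch below uses the textbook approximation for comparing two proportions with equal sample sizes. It is not necessarily how Driftbase computes its MDEs; the function name and the default `alpha` and `power` values are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def mde_two_proportions(baseline_rate, n_per_version, alpha=0.05, power=0.80):
    """Approximate smallest absolute rate change detectable between two versions.

    Uses the normal approximation with the baseline variance for both groups.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    std_err = sqrt(2 * baseline_rate * (1 - baseline_rate) / n_per_version)
    return (z_alpha + z_beta) * std_err


# With few runs, only large shifts are detectable; more runs shrink the MDE.
for n in (15, 50, 500):
    print(n, round(mde_two_proportions(0.04, n), 3))
```

A change smaller than the MDE may be real but cannot be distinguished from noise at that sample size, which is why a small run count calls for caution.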
PR Comment Integration
Post drift reports directly to pull requests:
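- name: Generate drift report
  id: drift
  run: |
    # `|| true` keeps the step alive on REVIEW/BLOCK (exit 1) so the comment is still posted
    OUTPUT=$(driftbase diff v1 v2 --format=markdown) || true
    echo "report<<EOF" >> "$GITHUB_OUTPUT"
    echo "$OUTPUT" >> "$GITHUB_OUTPUT"
    echo "EOF" >> "$GITHUB_OUTPUT"
- uses: actions/github-script@v7
  env:
    REPORT: ${{ steps.drift.outputs.report }}
  with:
    script: |
      // Read the report from the environment so backticks in the markdown cannot break the script
      await github.rest.issues.createComment({
        issue_number: context.issue.number,
        owner: context.repo.owner,
        repo: context.repo.repo,
        body: process.env.REPORT
      })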
Result: GitHub-flavored markdown table with top contributors, MDEs, and rollback targets.
Rollback on Regression
# Run the diff once and reuse the JSON; `|| true` keeps the script alive when diff exits 1
REPORT=$(driftbase diff v1 v2 --format=json) || true
VERDICT=$(echo "$REPORT" | jq -r .verdict)
ROLLBACK=$(echo "$REPORT" | jq -r .rollback_target)
if [ "$VERDICT" = "BLOCK" ]; then
  echo "Behavioral regression detected. Rolling back to $ROLLBACK"
  kubectl set image deployment/agent agent="$ROLLBACK"
  exit 1
fi
See docs/ci-integration.md for GitLab CI, CircleCI, and advanced patterns.
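The same gate can be written in Python when a pipeline already has a Python step. This sketch relies only on the two JSON fields used by the shell script above (`verdict` and `rollback_target`) and on the documented exit-code convention; the function name `plan_from_report` and the sample report are hypothetical, and the real JSON may contain more fields.

```python
import json


def plan_from_report(report_json):
    """Return (exit_code, rollback_target) for a `diff --format=json` report."""
    report = json.loads(report_json)
    verdict = report["verdict"]
    # Exit-code convention from the Exit Codes section: SHIP/MONITOR pass, REVIEW/BLOCK fail.
    exit_code = 0 if verdict in ("SHIP", "MONITOR") else 1
    # Only BLOCK triggers a rollback, mirroring the shell script.
    rollback = report.get("rollback_target") if verdict == "BLOCK" else None
    return exit_code, rollback


# Hypothetical report for illustration.
sample = '{"verdict": "BLOCK", "rollback_target": "v1"}'
print(plan_from_report(sample))
```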
Use Cases
1. Pre-Deploy Drift Gate (GitHub Action)
Add .github/workflows/drift-check.yml:
name: Drift Check
on:
  pull_request:
    branches: [main]
permissions:
  pull-requests: write
  contents: read
jobs:
  drift-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run Driftbase drift check
        uses: driftbase-labs/driftbase-python/github-action@v1
        with:
          baseline-version: main
          current-version: ${{ github.head_ref }}
          fail-on-review: true
          github-token: ${{ secrets.GITHUB_TOKEN }}
Posts a color-coded drift report as a PR comment with verdict (SHIP/MONITOR/REVIEW/BLOCK) and dimension breakdown.
See github-action/README.md for full documentation.
2. Post-Deploy Monitoring
#!/bin/bash
# Daily drift check (cron: 0 9 * * *)
export LANGFUSE_PUBLIC_KEY=...
export LANGFUSE_SECRET_KEY=...
# `date -d` is GNU date syntax (Linux); on macOS/BSD use `date -v-1d +%Y-%m-%d`
driftbase connect --since "$(date -d '1 day ago' +%Y-%m-%d)"
driftbase diagnose --alert-on-drift
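If the monitoring job has to run on more than one operating system, the date arithmetic can be done in Python instead of the shell. The sketch below only builds the argument list for the `driftbase connect --since` call shown above; the helper names are hypothetical and nothing is executed.

```python
from datetime import date, timedelta


def since_argument(days_back=1, today=None):
    """Return the --since value as YYYY-MM-DD, `days_back` days before `today`."""
    today = today or date.today()
    return (today - timedelta(days=days_back)).isoformat()


def connect_command(days_back=1, today=None):
    """Build the argv list for the sync step without running it."""
    return ["driftbase", "connect", "--since", since_argument(days_back, today)]


print(" ".join(connect_command()))
```

The resulting list can be passed to `subprocess.run(..., check=True)` from a scheduled job.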
3. Incident Response
When users report unexpected agent behavior:
# Pull latest traces and diagnose
driftbase connect --since 2026-03-01
driftbase diagnose
# Inspect specific problematic run
driftbase inspect <run-id>
# Compare current vs. last known good
driftbase diff v2.0-stable v2.1-current
Configuration
Driftbase works out of the box with zero configuration. Optional settings:
# Set custom DB path
export DRIFTBASE_DB_PATH=/path/to/runs.db
# Set default Langfuse host
export LANGFUSE_HOST=https://your-instance.com
# Configure cost tracking
export DRIFTBASE_RATE_PROMPT_1M=2.50
export DRIFTBASE_RATE_COMPLETION_1M=10.00
# Reproducibility and sampling (Phase 1 correctness features)
export DRIFTBASE_SEED=42 # Random seed for reproducible drift reports (default: 42)
export DRIFTBASE_FINGERPRINT_LIMIT=5000 # Max runs per fingerprint (default: 5000)
export DRIFTBASE_BOOTSTRAP_ITERS=500 # Bootstrap iterations for confidence intervals (default: 500)
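To show why a seed and an iteration count matter for reproducible reports, here is a generic percentile-bootstrap sketch for the difference in means between two versions. It reuses the documented defaults (seed 42, 500 iterations) but is not Driftbase's implementation; the function name, the `level` parameter, and the sample data are assumptions.

```python
import random
from statistics import mean


def bootstrap_ci(baseline, current, iters=500, seed=42, level=0.95):
    """Percentile bootstrap interval for mean(current) - mean(baseline)."""
    rng = random.Random(seed)  # fixed seed: the same inputs always give the same interval
    diffs = []
    for _ in range(iters):
        b = rng.choices(baseline, k=len(baseline))  # resample with replacement
        c = rng.choices(current, k=len(current))
        diffs.append(mean(c) - mean(b))
    diffs.sort()
    lower = diffs[int((1 - level) / 2 * iters)]
    upper = diffs[int((1 + level) / 2 * iters) - 1]
    return lower, upper


# Hypothetical escalation flags (1 = escalated) for two versions of 100 runs each.
v1 = [0] * 96 + [1] * 4
v2 = [0] * 81 + [1] * 19
print(bootstrap_ci(v1, v2))
```

More iterations give a smoother interval at the cost of runtime; changing the seed changes the interval slightly, which is why pinning it keeps reports comparable between runs.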
See docs/configuration.md for advanced settings.
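The cost-tracking variable names suggest a price per one million tokens. Under that assumption, the cost of a single run could be estimated as in the sketch below. The function name `run_cost`, the fallback rate of 0, and the currency are hypothetical and are not confirmed by the source.

```python
import os


def run_cost(prompt_tokens, completion_tokens, env=os.environ):
    """Estimate the cost of one run from per-1M-token rates (assumed meaning)."""
    prompt_rate = float(env.get("DRIFTBASE_RATE_PROMPT_1M", "0"))
    completion_rate = float(env.get("DRIFTBASE_RATE_COMPLETION_1M", "0"))
    return (prompt_tokens * prompt_rate + completion_tokens * completion_rate) / 1_000_000


# Rates taken from the configuration example above.
rates = {"DRIFTBASE_RATE_PROMPT_1M": "2.50", "DRIFTBASE_RATE_COMPLETION_1M": "10.00"}
print(run_cost(prompt_tokens=1_000_000, completion_tokens=500_000, env=rates))
```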
Architecture
┌──────────────────────────────────────────────────────────────┐
│                        YOUR AI AGENT                         │
│        (instrumented with Langfuse via any framework)        │
└────────────────┬─────────────────────────────────────────────┘
                 │
                 │ traces
                 ▼
┌──────────────────────────────────────────────────────────────┐
│                           LANGFUSE                           │
│                   (observability platform)                   │
└────────────────┬─────────────────────────────────────────────┘
                 │
                 │ driftbase connect
                 ▼
┌──────────────────────────────────────────────────────────────┐
│                          DRIFTBASE                           │
│  ├─ Local SQLite DB (runs, fingerprints, epochs)             │
│  ├─ Drift analysis engine (12 dimensions)                    │
│  ├─ Baseline calibrator (auto-weights + thresholds)          │
│  ├─ Anomaly detector (multivariate outliers)                 │
│  └─ Verdict engine (SHIP/MONITOR/REVIEW/BLOCK)               │
└──────────────────────────────────────────────────────────────┘
Key principle: Driftbase is NOT a tracing tool. It's a drift detection layer that reads existing traces from Langfuse.
Roadmap
Completed:
- Langfuse connector with incremental sync
- LangSmith connector
- 12-dimension drift analysis
- Progressive weight learning from labeled deployments
- Statistical confidence tiers (TIER1/TIER2/TIER3)
- GitHub Action with standalone + cloud modes
- MCP server for Claude Desktop integration
- 60-second offline demo
Deferred (requires Cloud API):
- Privacy-first telemetry
- Opt-in data contribution for moat building
Future:
- Arize connector
- Generic OTEL ingestion
- Slack/PagerDuty alerting
- Web dashboard (Cloud tier)
Development
# Clone repo
git clone https://github.com/driftbase-labs/driftbase-python
cd driftbase-python
# Install in editable mode with dev dependencies
pip install -e '.[dev]'
# Run tests
pytest tests/
# Run linter
ruff check .
ruff format .
FAQ
Do I need to change my agent code?
No. Driftbase reads existing Langfuse traces. Your agent continues using Langfuse exactly as before.
Where is my data stored?
All analysis runs locally. Traces are stored in ~/.driftbase/runs.db (SQLite). Nothing leaves your machine unless you explicitly push to a remote backend (Pro tier feature).
What if I don't have Langfuse yet?
Set up Langfuse first: langfuse.com/docs/get-started. It takes ~10 minutes to instrument your agent with Langfuse, then you can use Driftbase.
What if I don't have historical traces?
Use driftbase testset generate to create synthetic baseline data, or start collecting traces now and compare future versions.
How often should I sync?
- Development: After every agent change
- Production: Daily or on-deploy via CI/CD
Does this work with LangSmith?
Yes! Driftbase supports both Langfuse and LangSmith. Use:
driftbase connect langsmith --project my-agent
Arize and generic OTEL support are planned for future releases.
Is this free?
Yes. The OSS SDK is free forever. We'll offer a Pro tier (hosted web dashboard, real-time alerting, team features) in the future, but the local CLI will always be free.
Support
- Docs: driftbase.io/docs
- Issues: github.com/driftbase-labs/driftbase-python/issues
- Discord: driftbase.io/discord
- Email: info@driftbase.io
License
Apache 2.0. See LICENSE.
Built with ❤️ for AI engineers who want to ship with confidence.