Debug AI Agents without burning money

Agent Flight Recorder ✈️

Debug your AI agents without burning money. Agent Flight Recorder caches your function outputs so you can replay failed runs without paying for the same API calls again.

The Problem

AI API calls (OpenAI, Anthropic, etc.) are expensive. During debugging and testing, you often run the same prompts multiple times and pay for every run. Agent Flight Recorder avoids this by:

  • Recording API responses on first execution
  • Replaying cached responses on subsequent identical calls
  • Tracking costs and savings with detailed analytics
  • Supporting multiple providers (OpenAI, Anthropic)

How It Works

  1. Intercept: When you call a function or API, Agent Flight Recorder (AFR) sits between your code and the real call.
  2. Fingerprint: It creates a unique "hash" based on the inputs (e.g., the prompt sent to OpenAI).
  3. Check Cache:
    • Found? Returns the saved result immediately, with no API cost and no network round trip. ✈️
    • New? Executes the real function, saves the result, and returns it. 🔴

This means you can run your test suite 100 times, but only pay for the API calls once.

Installation

pip install agent-flight-recorder

Quick Start

Basic Usage (Without API Integration)

from agent_flight_recorder import Recorder

# Initialize recorder
recorder = Recorder(save_dir="./afr_logs")

# Decorate your expensive function
@recorder.trace(session_id="my_session")
def expensive_computation(n):
    print(f"Computing factorial of {n}...")
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

# First run: Executes and records
result1 = expensive_computation(5)  # Takes time
# 🔴 [RECORD] Running live function: expensive_computation...

# Second run: Replayed instantly!
result2 = expensive_computation(5)  
# ✈️  [REPLAY] Loaded cached result for expensive_computation

# View statistics
recorder.stats()

OpenAI Integration

from agent_flight_recorder import Recorder
import openai

# Initialize recorder with OpenAI provider
recorder = Recorder(
    save_dir="./afr_logs",
    providers=["openai"]
)

# Your OpenAI calls are now recorded & replayed!
# Note: openai.ChatCompletion is the interface of the pre-1.0 openai SDK.
response1 = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
    session_id="my_session"
)

# Identical second call uses cache (no API cost!)
response2 = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
    session_id="my_session"
)

# View cost savings
recorder.stats("my_session")

Anthropic Integration

from agent_flight_recorder import Recorder
from anthropic import Anthropic

# Initialize with Anthropic provider
recorder = Recorder(
    save_dir="./afr_logs",
    providers=["anthropic"]
)

# Your Anthropic calls are automatically intercepted
client = Anthropic()
response1 = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
    session_id="my_session"
)

# Replay the cached response
response2 = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
    session_id="my_session"
)

Configuration

Recorder Parameters

recorder = Recorder(
    save_dir="./afr_logs",              # Where to store recordings
    mode="auto",                        # "auto" (record then replay), "record-only", "replay-only"
    providers=["openai", "anthropic"],  # List of providers to intercept
    storage_backend="sqlite"            # "sqlite" (recommended) or "json"
)

Trace Decorator Options

@recorder.trace(
    session_id="my_test",      # Unique session identifier (required)
    ttl=3600,                  # Time-to-live in seconds (None = never expire)
    version="v1"               # Version string to invalidate old caches
)
def my_function():
    return expensive_api_call()
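
One way ttl and version could interact with the cache is sketched below. The sketch rests on two assumptions: the version string is folded into the cache key, so bumping it misses every old entry, and each entry stores its creation time so it can be checked against ttl. The helper names make_key and is_fresh are hypothetical and not part of AFR's API.

```python
import hashlib
import json
import time


def make_key(session_id, version, func_name, args):
    """Fold session and version into the key so a version bump misses old entries."""
    payload = json.dumps([session_id, version, func_name, args], default=repr)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_fresh(created_at, ttl, now=None):
    """Return True if an entry is still valid; ttl=None means it never expires."""
    if ttl is None:
        return True
    now = time.time() if now is None else now
    return (now - created_at) < ttl


key_v1 = make_key("my_test", "v1", "my_function", [])
key_v2 = make_key("my_test", "v2", "my_function", [])
# key_v1 and key_v2 differ, so entries recorded under "v1" are never hit by "v2".
```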

API Reference

Recorder Methods

  • trace(session_id, ttl, version) - Decorator to record/replay function calls
  • stats(session_id) - Display analytics and cost savings (session_id can be omitted, as in the Quick Start call recorder.stats())
  • clear(session_id) - Clear cached recordings
  • deactivate() - Disable all interceptors
  • diff(session_id_1, session_id_2) - Compare two sessions
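
To give a feel for what comparing two sessions can involve, here is a rough sketch. It assumes each session can be viewed as a dict mapping a call fingerprint to its recorded result. The function diff_sessions and the report keys are hypothetical, and the output of AFR's own diff may look different.

```python
def diff_sessions(session_a, session_b):
    """Compare two sessions given as dicts of fingerprint -> recorded result."""
    keys_a, keys_b = set(session_a), set(session_b)
    return {
        "only_in_first": sorted(keys_a - keys_b),
        "only_in_second": sorted(keys_b - keys_a),
        "changed": sorted(
            k for k in keys_a & keys_b if session_a[k] != session_b[k]
        ),
    }


run_1 = {"call-a": "Hi there!", "call-b": "42"}
run_2 = {"call-a": "Hello!", "call-c": "done"}
report = diff_sessions(run_1, run_2)
```

Here call-a is reported as changed, call-b appears only in the first run, and call-c only in the second.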

Command-Line Interface

# View statistics for all sessions
afr stats

# View statistics for specific session
afr stats --session my_session

# List all recorded sessions
afr list

# Clear cache for specific session
afr clear --session my_session

# Clear all caches
afr clear

# Compare two sessions
afr diff --session session_1 --compare session_2

Storage Backends

SQLite (Recommended)

  • Pros: Fast, indexed queries, built-in TTL support, cost tracking
  • Cons: Binary format (less portable)

recorder = Recorder(storage_backend="sqlite")

JSON

  • Pros: Human-readable, portable, easy to inspect
  • Cons: Slower for large datasets, no TTL support

recorder = Recorder(storage_backend="json")
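
To illustrate why a SQLite backend lends itself to indexed lookups and TTL handling, here is a small sketch built on the standard sqlite3 module. The SqliteStore class, the table layout, and the column names are assumptions for illustration only and do not describe AFR's real schema.

```python
import json
import sqlite3
import time


class SqliteStore:
    """Hypothetical key-value store with optional per-entry expiry."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        # The PRIMARY KEY gives an index on key, so lookups stay fast.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS recordings ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL, expires_at REAL)"
        )

    def put(self, key, value, ttl=None, now=None):
        now = time.time() if now is None else now
        expires_at = None if ttl is None else now + ttl
        self.conn.execute(
            "INSERT OR REPLACE INTO recordings VALUES (?, ?, ?)",
            (key, json.dumps(value), expires_at),
        )
        self.conn.commit()

    def get(self, key, now=None):
        now = time.time() if now is None else now
        row = self.conn.execute(
            "SELECT value, expires_at FROM recordings WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None  # never recorded
        value, expires_at = row
        if expires_at is not None and now >= expires_at:
            return None  # expired
        return json.loads(value)


store = SqliteStore()
store.put("abc", {"text": "Hello!"}, ttl=300, now=1000.0)
```

With this layout, store.get("abc", now=1100.0) returns the saved dict, while the same lookup at now=1300.0 returns None because the entry has expired.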

Cost Estimation

Agent Flight Recorder automatically estimates API costs based on token usage:

  • OpenAI: Estimates based on GPT-4, GPT-4 Turbo, GPT-3.5-turbo pricing
  • Anthropic: Estimates based on Claude 3 pricing
  • Custom: Add your own cost estimators

Example output:

==================================================
📊 AGENT FLIGHT RECORDER STATISTICS
==================================================
💰 Cost Saved: $3.24
📈 Total Cost Spent: $12.56
🔴 Live API Calls: 15
✈️  Replayed Calls: 42
⚡ Replay Rate: 73.7%
==================================================
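
The replay rate in the example output is consistent with replayed calls divided by all calls. The sketch below reproduces that figure and shows one shape a token-based cost estimator could take. Both function names are hypothetical, and the per-1,000-token prices are placeholders rather than real provider pricing.

```python
def replay_rate(live_calls, replayed_calls):
    """Percentage of all calls that were served from the cache."""
    total = live_calls + replayed_calls
    return 0.0 if total == 0 else 100.0 * replayed_calls / total


def estimate_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Token-based estimate; the caller supplies per-1,000-token prices."""
    input_cost = (input_tokens / 1000) * price_in_per_1k
    output_cost = (output_tokens / 1000) * price_out_per_1k
    return input_cost + output_cost


rate = replay_rate(live_calls=15, replayed_calls=42)
print(f"Replay Rate: {rate:.1f}%")  # Replay Rate: 73.7%

# Placeholder prices for illustration only, not real provider pricing.
cost = estimate_cost(2000, 500, price_in_per_1k=0.01, price_out_per_1k=0.03)
```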

Best Practices

1. Use Session IDs Strategically

# Good: Different sessions for different test suites
@recorder.trace(session_id="test_suite_1")
def test_function_1():
    pass

@recorder.trace(session_id="test_suite_2")
def test_function_2():
    pass

2. Version Your Caches

# When you change prompts, update version to invalidate old caches
@recorder.trace(session_id="chat_tests", version="v2")  # Invalidates v1 cache
def ask_ai(question):
    return openai.ChatCompletion.create(...)

3. Set Appropriate TTL

# Short TTL for data that changes frequently
@recorder.trace(session_id="live_data", ttl=300)  # 5 minutes
def fetch_live_data():
    return api.get_current_data()

# No TTL for stable test data
@recorder.trace(session_id="test_data")
def get_test_fixture():
    return api.get_fixture()

4. Use Replay-Only Mode for CI/CD

# In CI/CD, avoid unnecessary API calls
recorder = Recorder(
    save_dir="./afr_logs",
    mode="replay-only"  # Fail fast if cache is missing
)
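
Rather than hard-coding the mode, it can be chosen from the environment. The sketch below assumes your CI system sets a CI environment variable to a truthy value, as several common CI services do. The helper pick_mode is hypothetical, and the mode strings are the ones listed under Recorder Parameters.

```python
import os


def pick_mode(environ=None):
    """Return "replay-only" when a CI environment variable is set, else "auto"."""
    environ = os.environ if environ is None else environ
    in_ci = environ.get("CI", "").lower() in ("1", "true", "yes")
    return "replay-only" if in_ci else "auto"


mode = pick_mode({"CI": "true"})
# The result would then be passed on, e.g. Recorder(save_dir="./afr_logs", mode=mode).
```

Locally, where CI is unset, the same code falls back to "auto", so new calls are still recorded.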

Examples

See the examples/ directory for:

  • example_basic.py - Simple function recording
  • example_openai.py - OpenAI API integration
  • example_anthropic.py - Anthropic API integration
  • example_langchain.py - LangChain integration
  • autogpt_example.py - AutoGPT agent recording

Troubleshooting

Issue: "No cached result found" in replay-only mode

Solution: Make sure you recorded sessions first in "auto" mode, then switch to "replay-only".

Issue: Cache not being used

Possible causes:

  • Different function arguments (including the order of keyword arguments)
  • Different version parameter
  • TTL expired
  • Different session_id
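
To see how argument order alone can cause a cache miss, consider a fingerprint that serializes keyword arguments in the order they were passed. The sketch below assumes such a scheme (how AFR actually hashes its inputs is not specified here) and contrasts it with a normalized variant that sorts keys first. Both function names are hypothetical.

```python
import hashlib
import json


def naive_fingerprint(kwargs):
    """Serializes kwargs in insertion order, so order changes the hash."""
    return hashlib.sha256(json.dumps(kwargs).encode("utf-8")).hexdigest()


def normalized_fingerprint(kwargs):
    """Sorts keys first, so the same kwargs always give the same hash."""
    payload = json.dumps(kwargs, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


a = {"model": "gpt-3.5-turbo", "temperature": 0}
b = {"temperature": 0, "model": "gpt-3.5-turbo"}

order_matters = naive_fingerprint(a) != naive_fingerprint(b)            # True
order_ignored = normalized_fingerprint(a) == normalized_fingerprint(b)  # True
```

In practice, keeping keyword arguments in a consistent order at every call site avoids this class of miss regardless of how the fingerprint is computed.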

Issue: Anthropic interceptor not working

Solution: Ensure the anthropic package is installed:

pip install anthropic

Contributing

Contributions welcome! Open issues or PRs on GitHub.

License

MIT License - See LICENSE file for details.

Disclaimer

Agent Flight Recorder caches API responses. Be aware that:

  • Cached responses may become outdated
  • API errors are replayed as-is
  • Use version parameter to invalidate caches when needed
  • Always test with fresh API calls before production deployment

Support

  • Documentation: Check the README and examples
  • Issues: Report bugs on GitHub Issues
  • Questions: Open a discussion or see examples folder
