Local-first time-travel debugger for AI agent runs.

RunLens

RunLens is a local-first time-travel debugger for AI agents.

It records prompts, model calls, tool calls, state snapshots, errors, latency, token usage, and metadata into SQLite. The dashboard then lets you inspect the timeline, replay a run, fork from a bad step, patch that step's output, and compare the branch against its parent.

The core idea is simple: when an agent fails, you should not be staring at console logs and guessing. You should be able to see exactly what happened.

[Image: RunLens dashboard]

Why It Exists

AI agents fail in messy ways:

  • a tool returns bad data,
  • a model jumps to a conclusion its inputs do not support,
  • state gets corrupted halfway through a workflow,
  • a retry hides the first useful error,
  • the final answer looks wrong but the cause is five steps earlier.

RunLens gives those failures a shape: timeline, replay, fork, patch, diff.

Quick Start

Install from PyPI after release:

python -m pip install runlens-ai
runlens demo --mock --reset
runlens server

Install from a local checkout:

python -m pip install -e ".[dev]"
python -m runlens demo --mock --reset
python -m runlens server

Open:

http://127.0.0.1:8765

The built-in demo creates:

  • broken-research-agent: a failed run caused by weak search results.
  • repaired-research-agent: a completed fork after patching the bad tool output.

What To Try In The Demo

  1. Select broken-research-agent and click through each step to find the bad search_web result.
  2. Select repaired-research-agent and open Diff to see exactly which nested fields changed.
  3. Open Replay to inspect the deterministic timeline.
  4. Select step 02 search_web, edit the Fork Output JSON, and click Create to make another branch.
  5. Export a trace and import it again to confirm the portable .runlens.json workflow.
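
The fork-and-patch idea in step 4 can be sketched in plain Python. This is a minimal in-memory illustration, not RunLens code: the step dictionaries, the `fork_with_patch` helper, and the choice to drop steps after the patched one are all assumptions made for the example.

```python
import copy


def fork_with_patch(steps: list[dict], step_index: int, patched_output: dict) -> list[dict]:
    """Return a new branch that ends at the patched step.

    Hypothetical helper: steps up to step_index are deep-copied so the
    parent run stays untouched, and later steps are dropped because they
    were computed from the old output.
    """
    if not 0 <= step_index < len(steps):
        raise IndexError(f"no step at index {step_index}")
    branch = copy.deepcopy(steps[: step_index + 1])
    branch[step_index]["output"] = patched_output
    branch[step_index]["patched"] = True
    return branch


# A toy parent run shaped like the demo: the search step returns nothing.
parent = [
    {"name": "prompt", "output": {"text": "Find credible RAG evaluation papers"}},
    {"name": "search_web", "output": {"results": []}},
    {"name": "writer", "output": {"draft": "No sources found."}},
]

branch = fork_with_patch(parent, 1, {"results": ["RAGAS paper"]})
```

Copying rather than mutating keeps the parent available for a later parent-vs-fork comparison.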

Python SDK

from runlens import trace

with trace("research-agent", tags=["demo"]) as run:
    run.log_prompt("Find credible RAG evaluation papers")
    run.log_tool_call(
        "search",
        input={"query": "RAG evaluation benchmark"},
        output={"results": []},
    )
    run.log_model_call(
        "writer",
        input={"sources": []},
        output={"draft": "No sources found."},
        total_tokens=88,
        latency_ms=231.4,
    )
    run.log_response({"status": "needs_repair"})

Tool decorator:

from runlens import trace

with trace("agent-with-tools") as run:
    @run.tool("lookup")
    def lookup(query: str) -> dict:
        return {"query": query, "result": "example"}

    lookup("RAGAS paper")
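
For readers curious how a tool decorator of this kind can capture a call, here is a rough standalone sketch. `ToolRecorder` and its step fields are hypothetical names chosen for illustration; the actual `run.tool` implementation may record different data.

```python
import functools
import time


class ToolRecorder:
    """Hypothetical stand-in for a run object that records tool calls."""

    def __init__(self):
        self.steps = []

    def tool(self, name):
        def decorator(func):
            @functools.wraps(func)  # keep the wrapped function's name and docstring
            def wrapper(*args, **kwargs):
                started = time.perf_counter()
                step = {"type": "tool", "name": name,
                        "input": {"args": args, "kwargs": kwargs}}
                try:
                    result = func(*args, **kwargs)
                    step["output"] = result
                    return result
                except Exception as exc:
                    # Record the failure, then let it propagate to the caller.
                    step["error"] = repr(exc)
                    raise
                finally:
                    # Runs on success and on failure, so every call is logged.
                    step["latency_ms"] = (time.perf_counter() - started) * 1000
                    self.steps.append(step)
            return wrapper
        return decorator


recorder = ToolRecorder()


@recorder.tool("lookup")
def lookup(query: str) -> dict:
    return {"query": query, "result": "example"}


lookup("RAGAS paper")
```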

CLI

python -m runlens init
python -m runlens demo --mock --reset
python -m runlens runs
python -m runlens replay <run_id>
python -m runlens export <run_id> --out trace.json
python -m runlens import trace.json
python -m runlens doctor
python -m runlens server --host 127.0.0.1 --port 8765

If the console script is on your PATH, use runlens instead of python -m runlens.

On PyPI the distribution name is runlens-ai, because the name runlens is already taken there. The import name and the CLI command are still runlens:

pip install runlens-ai
runlens demo --mock --reset

Dashboard

The dashboard includes:

  • run list with statuses, latency, token totals, and fork markers,
  • ordered timeline of prompt/model/tool/state/error/response/check steps,
  • step detail panel with input/output/error JSON,
  • deterministic replay strip,
  • fork editor for patching step output,
  • deterministic or live-rerun-scaffold fork mode,
  • parent-vs-fork diff with status and token comparison,
  • nested changed paths inside input/output/error JSON,
  • browser import/export of .runlens.json traces,
  • responsive layout for desktop and mobile.
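
The "nested changed paths" item can be illustrated with a small recursive comparison. This is only a sketch of the general technique, assuming JSON-like values; the path syntax (`a.b[0]`, with `$` for the root) is an assumption and may not match what the dashboard displays.

```python
def changed_paths(old, new, prefix=""):
    """List the paths at which two JSON-like values differ."""
    if isinstance(old, dict) and isinstance(new, dict):
        paths = []
        for key in sorted(set(old) | set(new)):
            child = f"{prefix}.{key}" if prefix else str(key)
            if key not in old or key not in new:
                paths.append(child)  # key added or removed
            else:
                paths.extend(changed_paths(old[key], new[key], child))
        return paths
    if isinstance(old, list) and isinstance(new, list):
        paths = []
        for i in range(max(len(old), len(new))):
            child = f"{prefix}[{i}]"
            if i >= len(old) or i >= len(new):
                paths.append(child)  # element added or removed
            else:
                paths.extend(changed_paths(old[i], new[i], child))
        return paths
    # Scalars or mismatched types: report the path itself if the values differ.
    return [] if old == new else [prefix or "$"]


parent_step = {"output": {"results": [], "meta": {"source": "web"}}}
fork_step = {"output": {"results": ["RAGAS paper"], "meta": {"source": "web"}}}

diff = changed_paths(parent_step, fork_step)
```

Here `diff` holds the single path `output.results[0]`, since only the patched search result differs.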

Adapters

Manual SDK logging works everywhere. Optional adapter helpers are included for common model clients:

python -m pip install ".[openai]"
python -m pip install ".[anthropic]"
python -m pip install ".[litellm]"

OpenAI helper:

from openai import OpenAI
from runlens import trace
from runlens.adapters.openai import RunLensOpenAI

client = OpenAI()

with trace("openai-agent") as run:
    wrapped = RunLensOpenAI(client, run)
    wrapped.chat_completions_create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Say hello"}],
    )

Runnable OpenAI example (the environment variable line uses PowerShell syntax):

python -m pip install ".[openai]"
$env:OPENAI_API_KEY="sk-..."
python examples/openai_agent.py
python -m runlens server

LiteLLM helper:

from runlens import trace
from runlens.adapters.litellm import completion

with trace("litellm-agent") as run:
    completion(
        run,
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "Say hello"}],
    )

Claude / Anthropic helper:

from anthropic import Anthropic
from runlens import trace
from runlens.adapters.anthropic import RunLensAnthropic

client = Anthropic()

with trace("claude-agent") as run:
    wrapped = RunLensAnthropic(client, run)
    wrapped.messages_create(
        model="claude-sonnet-4-5",
        max_tokens=500,
        messages=[{"role": "user", "content": "Say hello"}],
    )

Runnable Claude example (the environment variable line uses PowerShell syntax):

python -m pip install ".[anthropic]"
$env:ANTHROPIC_API_KEY="sk-ant-..."
python examples/claude_agent.py
python -m runlens server

Graph node wrapper:

from runlens import trace
from runlens.adapters.langgraph import wrap_node

def plan_node(state: dict) -> dict:
    return {**state, "plan": ["search", "write", "check"]}

with trace("graph-agent") as run:
    wrapped_plan = wrap_node(run, "plan_node", plan_node)
    wrapped_plan({"task": "research RAG eval"})

Architecture

Python SDK / adapters
        |
        v
RunLensStore (SQLite)
        |
        +-- FastAPI JSON API
        |
        +-- React dashboard static bundle
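
To make the storage layer in this diagram more concrete, here is a toy append-only trace store on top of the standard sqlite3 module. The `TinyTraceStore` class and its table layout are assumptions for illustration only; the real RunLensStore schema is not described in this README.

```python
import json
import sqlite3


class TinyTraceStore:
    """Hypothetical minimal store: one row per step, ordered by seq within a run."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS steps ("
            " run_id TEXT NOT NULL,"
            " seq INTEGER NOT NULL,"
            " kind TEXT NOT NULL,"
            " payload TEXT NOT NULL,"
            " PRIMARY KEY (run_id, seq))"
        )

    def append(self, run_id, kind, payload):
        # Next sequence number for this run, starting at 1.
        seq = self.conn.execute(
            "SELECT COALESCE(MAX(seq), 0) + 1 FROM steps WHERE run_id = ?",
            (run_id,),
        ).fetchone()[0]
        self.conn.execute(
            "INSERT INTO steps (run_id, seq, kind, payload) VALUES (?, ?, ?, ?)",
            (run_id, seq, kind, json.dumps(payload)),
        )
        self.conn.commit()
        return seq

    def timeline(self, run_id):
        # Reading stored steps back in order never re-executes tools or models.
        rows = self.conn.execute(
            "SELECT seq, kind, payload FROM steps WHERE run_id = ? ORDER BY seq",
            (run_id,),
        )
        return [(seq, kind, json.loads(payload)) for seq, kind, payload in rows]


store = TinyTraceStore()
store.append("run-1", "prompt", {"text": "Find credible RAG evaluation papers"})
store.append("run-1", "tool", {"name": "search", "output": {"results": []}})
timeline = store.timeline("run-1")
```

Because replay in this sketch only reads stored rows, it returns the same timeline every time, which is one plausible reading of "deterministic replay".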

Default database path:

.runlens/runlens.sqlite

Override it with the RUNLENS_DB environment variable (PowerShell syntax shown):

$env:RUNLENS_DB="C:\path\to\runlens.sqlite"
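
The lookup order described here (environment variable first, default path otherwise) could be sketched as follows. `resolve_db_path` is a hypothetical helper, and treating an empty value as unset is an assumption rather than documented RunLens behavior.

```python
import os
from pathlib import Path

# Default taken from the README; relative to the current working directory.
DEFAULT_DB_PATH = Path(".runlens") / "runlens.sqlite"


def resolve_db_path(environ=None) -> Path:
    """Return the RUNLENS_DB override if set and non-empty, else the default."""
    env = os.environ if environ is None else environ
    override = env.get("RUNLENS_DB", "").strip()
    return Path(override) if override else DEFAULT_DB_PATH


default_path = resolve_db_path({})
custom_path = resolve_db_path({"RUNLENS_DB": "/tmp/traces/runlens.sqlite"})
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.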

Privacy Defaults

RunLens is local-first. It does not upload traces anywhere.

Redaction is applied before storage for common secret keys such as:

  • api_key
  • authorization
  • password
  • secret
  • token
  • cookie

Large strings and large containers are truncated before storage.
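
A redact-then-truncate pass of this kind might look like the sketch below. The key list comes from this section; the placeholder text, the size limits, and the exact-match rule on lowercased keys are assumptions, and the actual RunLens rules may differ.

```python
SECRET_KEYS = {"api_key", "authorization", "password", "secret", "token", "cookie"}
MAX_STRING = 2000  # assumed limit, for illustration only
MAX_ITEMS = 100    # assumed limit, for illustration only
TRUNCATION_MARK = "...[truncated]"


def sanitize(value, key=""):
    """Redact values stored under secret-looking keys and trim oversized data."""
    # Exact match on the lowercased key, so "total_tokens" is not redacted.
    if key.lower() in SECRET_KEYS:
        return "[REDACTED]"
    if isinstance(value, dict):
        items = list(value.items())[:MAX_ITEMS]
        return {k: sanitize(v, str(k)) for k, v in items}
    if isinstance(value, (list, tuple)):
        return [sanitize(v) for v in value[:MAX_ITEMS]]
    if isinstance(value, str) and len(value) > MAX_STRING:
        return value[:MAX_STRING] + TRUNCATION_MARK
    return value


raw = {
    "api_key": "sk-123",
    "headers": {"Authorization": "Bearer abc"},
    "total_tokens": 88,
    "note": "a" * 3000,
}
clean = sanitize(raw)
```

Sanitizing before the write, rather than at display time, means the secret never reaches the database file at all.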

Production Guardrails

RunLens v0.1 includes practical hardening for local and demo deployments:

  • SQLite WAL mode and write busy timeout,
  • restricted default CORS origins,
  • configurable request body limit,
  • security headers on API and dashboard responses,
  • fork patch validation at both API and storage boundaries,
  • Docker image with healthcheck and persistent /data volume.
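
The first guardrail can be reproduced with the standard sqlite3 module. This sketch shows the two PRAGMA statements involved; the `open_store` helper and the 5000 ms timeout are illustrative assumptions, not values taken from RunLens.

```python
import os
import sqlite3
import tempfile


def open_store(path, busy_timeout_ms=5000):
    """Open a SQLite file with write-ahead logging and a busy timeout."""
    conn = sqlite3.connect(path)
    # WAL lets readers keep working while a single writer appends.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to this long for a lock instead of failing immediately.
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")
    return conn


# WAL needs a real file; an in-memory database stays in "memory" journal mode.
with tempfile.TemporaryDirectory() as tmp_dir:
    conn = open_store(os.path.join(tmp_dir, "runlens.sqlite"))
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
    conn.close()
```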

Configuration (PowerShell syntax):

$env:RUNLENS_DB="C:\path\to\runlens.sqlite"
$env:RUNLENS_CORS_ORIGINS="https://your-demo.example.com"
$env:RUNLENS_MAX_REQUEST_BYTES="5000000"
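
A server could read the last two settings roughly as shown below. The variable names are from this README; everything else is an assumption made for the sketch, including the comma-separated origin format, the fallback limit (borrowed from the example value above, not a confirmed default), and the `ServerSettings` and `load_settings` names.

```python
import os
from dataclasses import dataclass

FALLBACK_MAX_BYTES = 5_000_000  # assumed fallback, mirrors the example value


@dataclass(frozen=True)
class ServerSettings:
    cors_origins: tuple[str, ...]
    max_request_bytes: int


def load_settings(environ=None) -> ServerSettings:
    """Parse CORS origins and the request body limit from the environment."""
    env = os.environ if environ is None else environ
    raw_origins = env.get("RUNLENS_CORS_ORIGINS", "")
    # Assumption: multiple origins are separated by commas.
    origins = tuple(o.strip() for o in raw_origins.split(",") if o.strip())
    limit = int(env.get("RUNLENS_MAX_REQUEST_BYTES", FALLBACK_MAX_BYTES))
    if limit <= 0:
        raise ValueError("RUNLENS_MAX_REQUEST_BYTES must be a positive integer")
    return ServerSettings(origins, limit)


settings = load_settings({
    "RUNLENS_CORS_ORIGINS": "https://a.example, https://b.example",
    "RUNLENS_MAX_REQUEST_BYTES": "1024",
})
```

Failing fast on a non-positive limit surfaces a misconfiguration at startup instead of at the first oversized request.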

Run with Docker:

docker build -t runlens-ai:0.1.0 .
docker run --rm -p 8765:8765 -v runlens-data:/data runlens-ai:0.1.0

Development

python -m pip install -e ".[dev]"
cd frontend
npm install
npm run build
cd ..
python -m pytest
python -m runlens demo --mock --reset
python -m runlens server

Frontend dev server:

cd frontend
npm run dev

The Vite dev server proxies /api to http://127.0.0.1:8765.

Publishing

Release docs live in:

  • docs/PUBLISHING.md
  • docs/DEPLOYMENT.md
  • docs/RELEASE_CHECKLIST.md
  • CHANGELOG.md
  • SECURITY.md
  • CONTRIBUTING.md

GitHub repository:

https://github.com/harshbhatia04/RUNLENS

v0.1 Status

Included:

  • local SQLite trace storage,
  • manual Python SDK,
  • OpenAI, Anthropic, and LiteLLM adapter helpers, plus a graph node wrapper,
  • FastAPI trace API,
  • React dashboard,
  • deterministic replay,
  • fork-with-patched-output,
  • parent-vs-fork diff,
  • trace import/export from CLI and browser,
  • nested JSON diff paths,
  • live-rerun scaffold boundary for patched forks,
  • built-in repaired demo,
  • tests and package metadata.

Not included yet:

  • hosted cloud sync,
  • auth,
  • billing,
  • automatic live rerun across arbitrary private frameworks without user runner code,
  • binary artifact storage.

License

MIT
