AgentLens
When your AI agent breaks, AgentLens tells you exactly which decision caused it — and what to change.
Not logs. Not traces. The answer.
ROOT CAUSE: tool_selection
FAILED AT: Step 2 (search_web)
WHY: Both tools had identical descriptions — the agent treated them as
interchangeable and picked the wrong one.
FIX: Rewrite tool descriptions so search_web is clearly for external
lookup and query_db is clearly for local records.
CONFIDENCE: 0.90
Works with Anthropic · OpenAI · LangGraph · CrewAI · AutoGen · PydanticAI · raw API
Install
pip install agentlens-ai
Or from source:
git clone https://github.com/abishekgiri/agentlens.git
cd agentlens
pip install -e ".[anthropic,openai]"
Quickstart — 2 lines
Add AgentLens before your existing provider client. No other changes.
import agentlens
agentlens.init() # patches Anthropic + OpenAI automatically
import anthropic
client = anthropic.Anthropic() # captured from here on
@agentlens.run(name="my_agent")  # groups everything into one run
def run_agent(query):
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=512,
        messages=[{"role": "user", "content": query}],
    )
    return response
Async agents work exactly the same — agentlens.init() patches AsyncAnthropic and AsyncOpenAI too.
Run your agent, then:
agentlens runs list
agentlens diagnose <run_id>
Framework examples
LangGraph
import agentlens
agentlens.init()
agentlens.patch_langgraph() # call before graph.compile()
from langgraph.graph import StateGraph
graph = StateGraph(MyState)
graph.add_node("planner", planner_fn)
graph.add_node("executor", executor_fn)
app = graph.compile() # automatically wrapped — all nodes traced
@agentlens.run(name="langgraph_agent")
async def run(input):
    return await app.ainvoke({"messages": input})
OpenAI async
import agentlens
agentlens.init()
from openai import AsyncOpenAI
client = AsyncOpenAI() # captured automatically
@agentlens.run(name="openai_agent")
async def run_agent(query):
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": query}],
        # add tools=[...] here if your agent uses tools; an empty list is rejected by the API
    )
    return response
Multi-agent tracing
# Parent agent
ctx = agentlens.get_trace_context()
# Child agent (different process / service)
agentlens.init(parent_context=ctx) # stitches child trace into parent
What AgentLens catches
Six failure categories, detected automatically:
| Category | What it means |
|---|---|
| tool_selection | Agent picked the wrong tool, usually because descriptions were too similar |
| loop | Agent repeated the same tool call with the same inputs without exit |
| cascade | A tool returned bad/stale data and a downstream step used it and failed |
| context_pollution | Contradictory instructions in the prompt diluted the agent's goal |
| state_drift | Agent abandoned its original goal mid-run |
| overflow | Critical context was pushed out of the context window before the key decision |
Plus hallucination detection — invented tool parameters, missing required fields, LLM output that contradicts what the tool actually returned.
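To make the loop category concrete, here is a minimal sketch of how repeated tool calls could be flagged. It is an illustration only: the span layout (dicts with `tool` and `input` keys), the name `find_loop`, and the threshold of three repeats are assumptions for this example, not AgentLens's actual schema or heuristic.

```python
import json


def find_loop(spans, min_repeats=3):
    """Return (tool, first_step) for the first run of identical consecutive calls, else None."""
    prev_key = None
    run_start = 0
    run_len = 0
    for i, span in enumerate(spans):
        # Serialize inputs with sorted keys so equal dicts compare equal.
        key = (span["tool"], json.dumps(span["input"], sort_keys=True))
        if key == prev_key:
            run_len += 1
        else:
            prev_key, run_start, run_len = key, i, 1
        if run_len >= min_repeats:
            return span["tool"], run_start + 1  # report a 1-indexed step
    return None


# Hypothetical trace: the agent calls search_web three times with the same query.
spans = [
    {"tool": "query_db", "input": {"id": 7}},
    {"tool": "search_web", "input": {"q": "refund policy"}},
    {"tool": "search_web", "input": {"q": "refund policy"}},
    {"tool": "search_web", "input": {"q": "refund policy"}},
]
print(find_loop(spans))  # ('search_web', 2)
```

A real detector would likely also need to handle loops that alternate between two tools; this sketch only covers back-to-back repeats.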
CLI reference
# Runs
agentlens runs list # all recent runs with status + span count
agentlens runs show <run_id> # full span detail for one run
agentlens runs view <run_id> # open visual timeline in browser
agentlens runs prompt <run_id> # print exact LLM prompts sent at each step
agentlens runs prompt <run_id> --step 2 # prompt for a specific LLM call only
agentlens runs replay <run_id> # interactive step-by-step playback (ENTER to advance)
agentlens runs stitch <run_id> # show multi-agent trace tree rooted at this run
# Diagnosis
agentlens diagnose <run_id> # root cause analysis + hallucination report
agentlens similar <run_id> # find historically similar failures
agentlens similar <run_id> --top 10 # top-N matches
agentlens clusters # failure clusters across all runs + top fix
# Stats & cost
agentlens stats # token usage, latency, cost across all runs
agentlens stats <run_id> # per-run breakdown
# Utilities
agentlens anonymize <run_id> # redact secrets before sharing
agentlens feedback-template <run_id> # structured feedback form
agentlens evaluate # accuracy check against fixtures + real cases
agentlens doctor # system health check
Real example output
AgentLens Diagnosis
===================
ROOT CAUSE:
cascade
FAILED AT:
Step 3 (get_user_profile)
WHY:
Step 3 produced bad or corrupted output that caused a failure at step 6.
get_user_profile returned {"email": null, "warning": "stale cache entry"}
and send_email downstream tried to use the null email field.
FIX:
Validate the output from 'get_user_profile' before using it downstream;
if it is stale, empty, or malformed, stop and recover instead of feeding
it into the next step.
SECONDARY:
None
CONFIDENCE: 0.90
HALLUCINATIONS DETECTED:
[HIGH] step 5 — invented param: 'send_email' was called with 'priority'
which is not in its schema. Valid params: ['to', 'body'].
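The invented-param finding in the report above comes down to comparing a tool call's arguments with the tool's declared schema. The sketch below shows one way to do that check. The schema layout (`params` and `required` lists) and the function `check_tool_call` are hypothetical and chosen for this example; they are not taken from AgentLens.

```python
def check_tool_call(tool, args, schemas):
    """Compare a tool call's arguments against its declared schema and list mismatches."""
    schema = schemas[tool]
    valid = set(schema["params"])
    findings = []
    # Arguments the model supplied that the schema does not define.
    for name in sorted(set(args) - valid):
        findings.append(f"invented param: '{tool}' was called with '{name}'")
    # Required fields the model left out entirely.
    for name in sorted(set(schema["required"]) - set(args)):
        findings.append(f"missing required field: '{tool}' needs '{name}'")
    return findings


# Hypothetical schema matching the example report: send_email accepts 'to' and 'body'.
schemas = {"send_email": {"params": ["to", "body"], "required": ["to", "body"]}}
call_args = {"to": None, "body": "Welcome!", "priority": "high"}

for finding in check_tool_call("send_email", call_args, schemas):
    print(finding)
```

Note that this check only looks at argument names. Catching the null `to` value from the cascade example would need a separate value-level validation.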
How it works
agentlens.init() monkeypatches your provider clients when it is called, which is why it must run before you create a client; your existing call sites stay unchanged. Every LLM call, tool call, error, and memory snapshot is captured as a span and saved locally to .agentlens/runs/<run_id>.json.
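For readers unfamiliar with monkeypatching, the general technique can be sketched as follows. This is not AgentLens's implementation: `FakeClient`, `patch_method`, and the span fields are stand-ins invented for this example, and a real patch would target the provider SDK's own classes.

```python
import functools
import json
import time


class FakeClient:
    """Stand-in for a provider client; not a real SDK class."""

    def create(self, prompt):
        return f"echo: {prompt}"


captured_spans = []


def patch_method(cls, method_name):
    """Replace cls.method_name with a wrapper that records one span per call."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.time()
        span = {"name": f"{cls.__name__}.{method_name}", "args": list(args), "kwargs": kwargs}
        try:
            result = original(self, *args, **kwargs)
            span["output"] = result
            return result
        except Exception as exc:
            span["error"] = repr(exc)  # record the failure, then let it propagate
            raise
        finally:
            span["duration_s"] = time.time() - start
            captured_spans.append(span)

    setattr(cls, method_name, wrapper)


patch_method(FakeClient, "create")
FakeClient().create("hello")  # the call site itself is unchanged
print(json.dumps(captured_spans, indent=2))
```

Because the wrapper is installed on the class, every instance created afterwards is traced without touching the code that uses it.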
agentlens diagnose runs the trace through a preprocessing pipeline, then either an LLM-powered classifier (if ANTHROPIC_API_KEY or OPENAI_API_KEY is set) or a fast local heuristic fallback. The local fallback works offline with no API key required.
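The choice between the two classifiers described above can be pictured as a simple environment check. The sketch below is an assumption about the shape of that decision: the function name, the return labels, and the order in which the two keys are checked are all made up for illustration.

```python
import os


def choose_classifier(env=None):
    """Pick an LLM classifier when an API key is present, else the local heuristic."""
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        return "llm:anthropic"
    if env.get("OPENAI_API_KEY"):
        return "llm:openai"
    return "local_heuristic"  # works offline, no key needed


print(choose_classifier({}))                               # local_heuristic
print(choose_classifier({"OPENAI_API_KEY": "sk-example"}))  # llm:openai
```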
All data stays on your machine. No cloud. No signup. No account.
Add a real-world test case
real_world_cases/my-broken-agent/
├── trace.json # anonymized run from .agentlens/runs/
├── expected_diagnosis.json # {"root_cause_category": "loop", "failed_at_step": 4}
└── notes.md # what the agent was doing and what actually broke
Run agentlens evaluate to score the diagnosis engine against your case.
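Conceptually, scoring a case means comparing the fields in expected_diagnosis.json with what the engine produced. The sketch below shows that comparison on in-memory data. The function `score_case` is hypothetical, and it assumes the engine's output uses the same field names as the expected file, which the documentation above does not confirm.

```python
import json


def score_case(expected, actual):
    """Return a per-field match result for every key in the expected diagnosis."""
    return {key: actual.get(key) == value for key, value in expected.items()}


# Same shape as the expected_diagnosis.json example above.
expected = json.loads('{"root_cause_category": "loop", "failed_at_step": 4}')
# Hypothetical engine output that gets the category right but the step wrong.
actual = {"root_cause_category": "loop", "failed_at_step": 5, "confidence": 0.9}

result = score_case(expected, actual)
print(result)  # {'root_cause_category': True, 'failed_at_step': False}
print("pass" if all(result.values()) else "fail")  # fail
```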
What this is not
No dashboard. No hosted API. No database. No billing. No auth.
This is a local developer tool. The goal: when your agent breaks, run one command and get the answer in under 30 seconds.
Feedback
If AgentLens finds (or misses) a real bug in your agent, we want to know.
agentlens anonymize <run_id> # redact secrets
agentlens feedback-template <run_id> # fill this in and send it
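If you want to double-check a trace by hand before sending it, a redaction pass can be as simple as the sketch below. It is not what agentlens anonymize does internally: the two patterns (API-key-like tokens starting with `sk-` and email addresses) and the `[REDACTED]` placeholder are assumptions chosen for this example, and a real trace may contain other kinds of secrets.

```python
import re

# Illustrative patterns only; extend these for the secrets your agent handles.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{8,}"),                              # API-key-like tokens
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),    # email addresses
]


def redact(value):
    """Recursively replace secret-looking substrings in strings, lists, and dicts."""
    if isinstance(value, str):
        for pattern in SECRET_PATTERNS:
            value = pattern.sub("[REDACTED]", value)
        return value
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, dict):
        return {k: redact(v) for k, v in value.items()}
    return value  # numbers, booleans, and None pass through untouched


span = {"input": {"to": "jane@example.com", "note": "key is sk-abc12345XYZ"}, "step": 3}
print(redact(span))
# {'input': {'to': '[REDACTED]', 'note': 'key is [REDACTED]'}, 'step': 3}
```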