mimir-observe

Auto-instrumentation and visibility for AI agents — OpenAI Agents SDK, Claude SDK, and more. Two lines of code, zero config.

pip install mimir-observe

Then run:

mimir quickstart     # getting-started guide with copy-paste snippets
mimir dashboard      # start the local dashboard at http://localhost:9847

Quick start

1. Add instrumentation (2 lines)

Pick the one that matches your stack:

import mimir

# Raw OpenAI client (chat.completions.create)
mimir.instrument_openai()

# Raw Anthropic client (messages.create)
mimir.instrument_anthropic()

# OpenAI Agents SDK (Runner.run / Runner.run_streamed)
mimir.instrument_openai_agents()

# Claude Agent SDK (query)
mimir.instrument_claude()

Add these lines at the top of your entry point, before any API calls. That's it. Your existing code stays exactly the same.

2. Start the dashboard

In a separate terminal:

python -m mimir.cli dashboard

Open http://localhost:9847 to see your runs.

3. There is no step 3

Every API call and agent run is now captured automatically. The dashboard shows:

  • Agent list with run counts, models, and tools
  • Run timeline with every tool call (args + results), reasoning block, and token usage
  • Run diffing -- side-by-side comparison of any two runs
  • Deep Dive -- multi-run comparison grid with step alignment
  • Divergence detection -- flags agents whose reruns follow different tool patterns
  • AI Analysis -- click "Analyze" on any run for an AI-powered trace breakdown

What gets captured

Data           How
Tool calls     Name, arguments, result, duration
Reasoning      Model output text between tool calls
Token usage    Input/output tokens per call
Cost           If set via run.set_cost()
Run duration   Wall-clock time
Run status     Success or error
Input/output   Prompt and final result
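
One way to picture a single captured step is a small record holding these fields. A minimal sketch for illustration only — the class and field names here are hypothetical, not mimir's actual schema:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical record for one captured step -- illustrative only,
# not mimir's real wire format or field names.
@dataclass
class CapturedStep:
    kind: str                            # "tool_call" or "reasoning"
    name: Optional[str] = None           # tool name, for tool calls
    arguments: dict = field(default_factory=dict)
    result: Optional[str] = None
    duration_ms: Optional[float] = None
    input_tokens: int = 0                # token usage per call
    output_tokens: int = 0

step = CapturedStep(kind="tool_call", name="search",
                    arguments={"q": "test"}, result="3 results",
                    duration_ms=150, input_tokens=1500, output_tokens=800)
print(step.name, step.duration_ms)  # search 150
```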

Which instrument function do I use?

Your code uses                       Function
from openai import OpenAI            mimir.instrument_openai()
from anthropic import Anthropic      mimir.instrument_anthropic()
from agents import Runner            mimir.instrument_openai_agents()
from claude_code_sdk import query    mimir.instrument_claude()

You can call more than one of these if your project uses several SDKs.

Multi-turn agentic loops

If your agent calls the API multiple times in a loop, wrap it with mimir.trace() so all calls are grouped as one named run:

import mimir
mimir.instrument_openai()  # or instrument_anthropic()

from openai import OpenAI
client = OpenAI()

with mimir.trace("Migration Planner"):
    # Every API call inside here becomes a step in one run
    response = client.chat.completions.create(model="gpt-4o", messages=[...])
    response = client.chat.completions.create(model="gpt-4o", messages=[...])
    response = client.chat.completions.create(model="gpt-4o", messages=[...])

This is important when you have multiple agents using the same model — without trace(), they all get lumped together. Each trace("name") creates a distinct agent on the dashboard.

Without the wrapper, each API call creates its own run — fine for single calls, wrong for loops.

How it works

Mimir monkey-patches the SDK at the class level when you call instrument_*(). Every subsequent API call is intercepted, telemetry is extracted from the request/response, and it's sent to the local dashboard via fire-and-forget HTTP. Your agent code is never blocked or slowed down.

  • Zero external dependencies (stdlib only)
  • All data stays local (~/.mimir/)
  • Dashboard down? Agent runs normally, no errors
  • Uninstrument anytime: mimir.uninstrument_openai(), etc.

Manual instrumentation

For custom setups where auto-instrumentation doesn't fit:

import mimir

t = mimir.task(
    name="My Agent",
    config="what it does",
    tools=["search", "write"],
    model="gpt-4o",
)

with t.run(input={"prompt": "user input"}) as run:
    run.tool("search", {"q": "test"}, "3 results", duration_ms=150)
    run.reasoning("Found relevant results, writing report...")
    run.tool("write", {"file": "report.md"}, "ok", duration_ms=50)
    run.set_usage(1500, 800)
    run.set_output("Report written")

Onboarding with Claude Code

If you use Claude Code, paste this prompt to have it instrument your project automatically:

Install and set up Mimir agent observability in this project.

Step 1: pip install mimir-observe (if not already installed). Import as `import mimir`.

Step 2: Find the entry point(s) and determine which SDK is used:
  - `from openai import OpenAI` → add `mimir.instrument_openai()`
  - `from anthropic import Anthropic` → add `mimir.instrument_anthropic()`
  - `from agents import Runner` → add `mimir.instrument_openai_agents()`
  - `from claude_code_sdk import query` → add `mimir.instrument_claude()`

Add the 2 lines (import + instrument call) at the top of each entry point,
BEFORE any API calls. No other code changes needed.

Step 3: If the code has multi-turn agentic loops (calling the API multiple times
in a while/for loop), wrap each agent's loop with mimir.trace("Agent Name") so
all turns become steps in one run instead of separate runs:

    with mimir.trace("Migration Planner"):
        # ... the existing loop goes here, unchanged ...

Each distinct agent should get its own mimir.trace() wrapper with a unique name.
Single API calls outside a loop do NOT need this wrapper — they auto-create runs.

Step 4: Start the dashboard: python -m mimir.cli dashboard

AI Analysis

Click the Analyze button on any run in the dashboard for an AI-powered breakdown covering:

  • Plain English summary of what the agent did
  • Efficiency analysis (redundant steps, wasted tokens)
  • Cost breakdown by step
  • Red flags (loops, repeated failures, excessive reasoning)
  • Concrete improvement suggestions

Zero config — uses your existing OPENAI_API_KEY or ANTHROPIC_API_KEY from the environment. Your key is only used locally and goes directly from your machine to the LLM provider. No data passes through any third-party server.
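
The availability check amounts to looking for either key in the process environment. A minimal sketch — the function name is hypothetical and mimir's exact logic may differ:

```python
import os

def analysis_available() -> bool:
    # AI Analysis needs at least one provider key in the environment
    # of the process that runs the dashboard.
    return any(os.environ.get(k) for k in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY"))
```

This is also why the key must be set in the same terminal that starts the dashboard: child processes inherit the environment from their shell.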

The sidebar shows whether AI Analysis is available. If you don't see it, start the dashboard from the same terminal where your API key is set.

CLI

mimir quickstart                   # getting-started guide + Claude Code onboarding prompt
mimir dashboard                    # start dashboard on :9847
mimir dashboard --port 8080        # custom port
mimir version                      # print version

All commands also work as python -m mimir.cli <command>.

Requirements

  • Python 3.10+
  • No external dependencies

The SDKs you want to instrument (openai, anthropic, openai-agents, etc.) must be installed separately.

Project details

Source distribution: mimir_observe-0.9.1.tar.gz (52.2 kB)
  SHA256: c9c9057a221266ba7d7e4d7707480e592b1a1ee1e04eee79e28bf7ee6381ff5b

Built distribution: mimir_observe-0.9.1-py3-none-any.whl (49.6 kB)
  SHA256: 0a125db53ec6974d58388f40e4a6b703ee8e4a078754c3fec9503be02c8ab851

Both files were uploaded via twine/6.2.0 (CPython 3.14.3), without Trusted Publishing.