AgentLab

Universal record-and-replay for LLM agents.

Status: pre-alpha, APIs will change.

AgentLab captures model calls, tools, state transitions, and timing into a trace you can replay without hitting the network. It is built around a framework-agnostic core and an HTTP capture layer that works with any SDK that routes requests through httpx.

Overhead

Per-LLM-call cost of running inside agentlab.record():

metric        baseline   recorded   overhead
latency p50   13.5 ms    14.7 ms    +1.16 ms
latency p99   14.4 ms    15.9 ms    +1.52 ms

Measured against an in-process loopback HTTP server with a 10 ms upstream delay, which eliminates network jitter so the delta isolates SDK overhead (HTTP capture, span emit, JSONL write+fsync, matcher, LLMSpan build). Real LLM calls land in the 100 ms – 2000 ms range, so the added ~1.2 ms works out to roughly 1% of wall-clock time or less in practice.

Reproduce with:

uv run python scripts/bench_record_overhead.py --calls 200 --runs 5

Installation

pip install agentic-lab           # minimal SDK
pip install 'agentic-lab[ui]'     # + Starlette UI server

The PyPI distribution is agentic-lab; the importable Python module is agentlab:

import agentlab as al

For local development, this repo is uv-managed:

git clone https://github.com/ambuj-krishna-agrawal/agent-lab.git
cd agent-lab
uv sync --all-extras --frozen

Use --frozen by default so your environment matches uv.lock and CI.

Documentation

  • Quickstart — five minutes from install to a replayable trace.
  • Provider coverage — every supported LLM provider + how to add custom ones.
  • Error reference — every AGL-… code with a remediation sentence (auto-generated from src/agentlab/errors.py).
  • Changelog — version history.
  • AGENTS.md — invariants and quality gates contributors must respect.
  • CONTRIBUTING.md — human-contributor process.

Configuration

  • Secrets live in .env (git-ignored). Copy .env.example and set the provider keys you use.
  • Non-secret defaults live in src/agentlab/_defaults.toml and can be overridden by AGENTLAB_* environment variables (see the sketch after this list).
  • Full typed config lives in src/agentlab/config.py.
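
For example, a single default can be overridden per-process. A sketch: AGENTLAB_TRACE_ROOT is a hypothetical key shown for illustration (the real names mirror the typed fields in src/agentlab/config.py), and setting it before import assumes config resolves at import time.

import os

# Hypothetical key shown for illustration; the real names mirror the
# typed fields in src/agentlab/config.py.
os.environ["AGENTLAB_TRACE_ROOT"] = "/tmp/agentlab-traces"

import agentlab as al  # noqa: E402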

Quickstart

Five minutes from pip install to a trace you can replay without an API key. The full runnable script lives at example/quickstart.py; the inline version:

import os
import openai
import agentlab as al

client = openai.OpenAI(
    api_key=os.environ["OPENROUTER_API_KEY"],
    base_url="https://openrouter.ai/api/v1",
)

# 1. Record.
with (
    al.record(agent_name="quickstart") as recording,
    al.agent(name="quickstart", version="0"),
    al.step(role=al.StepRole.EXECUTE),
):
    response = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "Reply with the single word 'ok'."}],
        max_tokens=16,
    )
print("model said:", response.choices[0].message.content)
print("trace at:  ", recording.directory)

# 2. Replay — no network, no key.
with al.replay(str(recording.directory)) as session:
    replay = client.chat.completions.create(
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "Reply with the single word 'ok'."}],
        max_tokens=16,
    )
print("replay said:", replay.choices[0].message.content)
print("cache hits: ", session.cache_hits)

Run it end-to-end, then browse the trace in the UI:

pip install 'agentic-lab[ui]' openai
export OPENROUTER_API_KEY=sk-or-...
python example/quickstart.py
agentlab serve --root ~/.agentlab/traces
# → http://127.0.0.1:7861/

The al.agent(...) and al.step(...) envelopes give the auto-emitted LLMSpan a typed parent (the V4 schema forbids an LLM span under a bare RUN). Production agents normally establish these once near their entrypoints rather than repeating them per call; see example/workflows/ for that shape.
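
A minimal sketch of that entrypoint shape (handle_ticket is a hypothetical stand-in for your agent logic):

import agentlab as al

def handle_ticket() -> None:
    ...  # your agent logic; any httpx-based SDK call made here is captured

def main() -> None:
    # Establish the recording and agent envelopes once at the entrypoint...
    with (
        al.record(agent_name="support-bot") as recording,
        al.agent(name="support-bot", version="1"),
    ):
        # ...then open a step per unit of work. LLM calls made inside
        # inherit a typed parent, satisfying the V4 schema.
        with al.step(role=al.StepRole.EXECUTE):
            handle_ticket()
    print("trace at:", recording.directory)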

Larger example agents

Three reference agents under example/ cover the Anthropic building-effective-agents shapes:

Folder       Shape                                What it does
workflows/   Workflow (fixed code path)           Decompose → Wikipedia search → cite → LLM-as-judge → revise.
autonomous/  Autonomous (model picks each step)   LangGraph observe-plan-act loop that triages support tickets.
hybrid/      Workflow + autonomous sub-agent      Incident-response pipeline with an autonomous investigation step.

All three use OpenRouter via langchain-openai, real (or realistic) tools, and produce traces directly into example_traces/ that agentlab serve can browse.

Provider coverage

Inside an agentlab.record() block AgentLab patches httpx transport methods, so every SDK that routes through httpx (which is most modern Python LLM SDKs) lands its raw exchange in http.jsonl. That file is the source of truth for replay; the typed LLMSpan is a best-effort view layered on top.
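
Because the capture sits at the transport layer, even a raw httpx call is recorded. A sketch (the payload and key are illustrative):

import httpx

import agentlab as al

with (
    al.record(agent_name="raw-httpx") as recording,
    al.agent(name="raw-httpx", version="0"),
    al.step(role=al.StepRole.EXECUTE),
):
    # No SDK involved: the patched transport still writes this exchange
    # to http.jsonl inside the trace directory.
    httpx.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": "Bearer sk-or-..."},
        json={
            "model": "openai/gpt-4o-mini",
            "messages": [{"role": "user", "content": "hi"}],
        },
    )
print("raw exchange in:", recording.directory)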

The built-in matchers turn recognised exchanges into typed LLMSpans out of the box:

Provider                        Endpoint(s)                                                                       Stream?
OpenAI chat completions         api.openai.com/v1/chat/completions                                                yes
OpenAI Responses                api.openai.com/v1/responses                                                       yes
OpenAI Embeddings               api.openai.com/v1/embeddings                                                      n/a
Azure OpenAI chat completions   *.openai.azure.com/openai/deployments/<dep>/chat/completions                      yes
Anthropic Messages              api.anthropic.com/v1/messages                                                     yes
AWS Bedrock — Invoke            bedrock-runtime.<region>.amazonaws.com/model/<id>/invoke[-with-response-stream]   partial[^1]
AWS Bedrock — Converse          bedrock-runtime.<region>.amazonaws.com/model/<id>/converse[-stream]               partial[^1]
Google Gemini                   generativelanguage.googleapis.com/.../models/<m>:[stream]generateContent          yes
Vertex AI — Gemini              <region>-aiplatform.googleapis.com/.../models/<m>:[stream]generateContent         yes
Vertex AI — Anthropic (Claude)  <region>-aiplatform.googleapis.com/.../models/<m>:[stream]rawPredict              yes
OpenRouter                      openrouter.ai/api/v1/chat/completions                                             yes
Together AI                     api.together.{xyz,ai}/v1/chat/completions                                         yes
Groq                            api.groq.com/openai/v1/chat/completions                                           yes
Mistral                         api.mistral.ai/v1/chat/completions                                                yes
Fireworks                       api.fireworks.ai/inference/v1/chat/completions                                    yes
DeepInfra                       api.deepinfra.com/v1/openai/chat/completions                                      yes
Perplexity                      api.perplexity.ai/chat/completions                                                yes

[^1]: Bedrock streaming uses AWS event-stream binary framing. Buffered responses populate every LLMSpan field; streamed responses record the request side and a validation_errors entry explaining why the response side is empty. The raw bytes are still preserved in http.jsonl.

Adding a custom or self-hosted provider

OpenAI-compatible hosts (vLLM, Ollama, your private gateway) need a single registration call:

import agentlab as al
from agentlab.llm.matchers.openai import HostPathMatcher

al.register_llm_provider(HostPathMatcher(
    name="my-vllm",
    host_suffix="llm.internal.example.com",
    path_prefix="/v1/chat/completions",
))

For wholly different body shapes, subclass agentlab.llm.LLMProviderMatcher.

Pricing

The SDK is token-only by default: LLMSpan.cost.usd stays at 0.0 and the span is annotated with agentlab.llm.pricing.unknown=True. Provider list prices change too often to bake into the SDK. Operators who want USD computed on every span install their own table:

from agentlab.llm.pricing import PriceRow, StaticPriceTable, set_price_table

set_price_table(StaticPriceTable(rows=(
    PriceRow("openai", "gpt-4o", 2.50, 10.00),
    PriceRow("anthropic", "claude-3-5-sonnet*", 3.00, 15.00),
)))

Strict mode for unrecognised exchanges

By default, exchanges that don't match any provider matcher log a warning (one per (trace, host)) and the raw exchange remains in http.jsonl. Power users can opt into stricter behaviour:

with al.record(strict_unknown_provider="raise"):  # or "emit_op"
    ...

"raise" surfaces the gap as UnknownLLMProviderError; "emit_op" records the call as a typed OpSpan so the trace tree is complete even without a matcher.

UI and examples

Run the backend UI server against bundled traces:

uv run agentlab --root example_traces serve --port 7861

Optional frontend dev server with HMR:

cd frontend
npm install
npm run dev

The bundled runnable agents are seeded from example/ and are available from the Agents page when the server starts successfully.

Production deployment

The OSS UI server can be hosted on a single EC2 box behind Caddy, with a separate Next.js + Clerk marketing/auth site on Vercel that redirects authenticated users to it. See deploy/README.md for the end-to-end runbook.

UI walkthrough

Screenshots cover the Dashboard, Traces list, Trace detail, Agents, and Settings pages.

Development

Run the local quality gate:

bash scripts/check.sh

Equivalent commands:

uv run ruff check .
uv run ruff format --check .
uv run mypy
uv run pytest tests/unit tests/integration -n auto --dist=worksteal

Testing

Current test tiers:

  • tests/unit/: hermetic unit tests (no real network).
  • tests/integration/: in-process integration tests with mocked HTTP where needed.

For live-provider smoke runs, use the runnable examples in example/ through their CLIs or the UI Agents page.
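
Recorded traces can also back hermetic tests of your own. A sketch, assuming a fixture trace committed at a hypothetical path (promote.py is the project's replay-test scaffold generator):

import agentlab as al

def run_quickstart() -> None:
    ...  # the agent under test; its LLM calls are served from the trace

def test_quickstart_replays_without_network() -> None:
    with al.replay("tests/fixtures/quickstart_trace") as session:  # hypothetical path
        run_quickstart()
    assert session.cache_hits > 0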

Project layout

agentlab/
├── src/agentlab/
│   ├── __init__.py          # public API surface
│   ├── cli.py               # `agentlab` console entry point
│   ├── config.py            # typed settings
│   ├── recorder.py          # public `record()` context manager
│   ├── _defaults.toml       # bundled non-secret defaults
│   ├── _proto/              # generated protobuf bindings (private)
│   ├── bridges/             # export bridges (e.g. OTel GenAI)
│   ├── core/                # recording primitives
│   ├── io/                  # trace IO + HTTP capture
│   ├── integrations/        # framework adapters
│   ├── llm/                 # provider-agnostic LLM client
│   ├── replay/              # deterministic replay engine
│   ├── storage/             # JSONL + protobuf stores
│   ├── ui/                  # Starlette UI server + DTO mapping
│   ├── pytest.py            # pytest plugin
│   └── promote.py           # replay-test scaffold generator
├── frontend/                # React SPA for the UI server
├── example/                 # bundled runnable agent seeds
├── proto/agentlab/v1/trace.proto
├── scripts/                 # check, proto regen, UI screenshot helpers
├── tests/{unit,integration}/
└── uv.lock

License

Apache 2.0 — see LICENSE.
