Spanlens SDK for Python. Agent tracing, LLM usage capture, and cost observability.

Spanlens Python SDK

LLM observability for Python. Trace agent runs, capture token usage and cost, and link calls back to your Spanlens dashboard with one line of code.

Spanlens is the open-source LLM observability platform. This is the official Python SDK. For the dashboard, signup, and proxy docs, head to spanlens.io.

Install

pip install spanlens

# Or with provider integrations:
pip install "spanlens[openai]"
pip install "spanlens[anthropic]"
pip install "spanlens[gemini]"
pip install "spanlens[langchain]"
pip install "spanlens[all]"

Fastest start: the `spanlens` CLI

Installing the package gives you a spanlens command. Run the wizard from your project root and it detects your package manager, validates your key, writes .env, and rewrites your OpenAI(...) / Anthropic(...) / genai.configure(...) calls to route through Spanlens.

pip install spanlens
spanlens init

Useful flags:

spanlens init --dry-run                 # preview every change, write nothing
spanlens init --yes --api-key sl_live_  # non-interactive (CI / scripts)
spanlens init --server-url https://...  # self-hosted Spanlens
spanlens test                           # just validate the key + connectivity

The wizard re-parses every file it touches before saving, so it never leaves you with code that will not import. Prefer to wire it up by hand? The two manual modes below are all it does under the covers.
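
The re-parse-before-save check can be approximated with the standard library's ast module. This is a minimal sketch, not the wizard's actual code: safe_rewrite and the example rewrites are hypothetical, and the check only proves the result is syntactically valid Python, which is a weaker guarantee than "will import".

```python
import ast
from typing import Callable, Optional


def safe_rewrite(source: str, rewrite: Callable[[str], str]) -> Optional[str]:
    """Return the rewritten source, or None if it no longer parses.

    Hypothetical helper: a caller would write the file only when the
    return value is not None, leaving the original untouched otherwise.
    """
    candidate = rewrite(source)
    try:
        ast.parse(candidate)  # raises SyntaxError on invalid Python
    except SyntaxError:
        return None
    return candidate


original = "client = OpenAI()\n"
# A rewrite that keeps the file valid is accepted.
good = safe_rewrite(original, lambda s: s.replace("OpenAI()", "create_openai()"))
# A rewrite that drops the closing parenthesis is rejected.
bad = safe_rewrite(original, lambda s: s.replace("OpenAI()", "create_openai("))
```

Parsing before writing means a bad rewrite costs nothing: the file on disk is only replaced once the candidate text has passed the check.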

Two ways to use it

Mode	Best for	Setup
Proxy	Single-call observability, drop-in for the OpenAI/Anthropic SDK	Replace `base_url`
SDK tracing	Multi-step agents, RAG, tool calls, manual spans	`SpanlensClient(...)`

You can mix both. The proxy logs the raw request; the SDK groups multiple requests into a single trace with parent / child spans.
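
To make the parent / child grouping concrete, here is a toy data model of one trace holding several spans. It is a sketch for intuition only: the Span and Trace classes, the parent field, and the "rerank" span with its "tool" type are all hypothetical and do not reflect the SDK's internal types.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    name: str
    span_type: str
    parent: Optional[str] = None  # name of the parent span; None means top level


@dataclass
class Trace:
    name: str
    spans: List[Span] = field(default_factory=list)

    def children_of(self, parent: Optional[str]) -> List[str]:
        """Names of the spans directly under `parent`, in insertion order."""
        return [s.name for s in self.spans if s.parent == parent]


trace = Trace("rag_pipeline")
trace.spans.append(Span("retrieve", "retrieval"))
trace.spans.append(Span("generate", "llm"))
trace.spans.append(Span("rerank", "tool", parent="retrieve"))  # nested span
```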

Mode 1. Proxy (zero-code)

Get a Spanlens API key from your dashboard, then point your provider SDK at the Spanlens proxy:

from spanlens.integrations.openai import create_openai

# Reads SPANLENS_API_KEY from the environment
client = create_openai()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)

Spanlens automatically logs the request, response, latency, token counts, and cost. View them in the dashboard under Requests.

Async (FastAPI, Django async views, asyncio)

Matching async helpers return the provider's async client:

from spanlens.integrations.openai import create_async_openai
from spanlens.integrations.anthropic import create_async_anthropic

async def handler() -> str:
    client = create_async_openai()  # openai.AsyncOpenAI
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    return resp.choices[0].message.content

The SDK's background ingest pool is thread-safe: you can fan out 50+ concurrent spans with asyncio.gather, and trace/span POST ordering is still preserved.

Tagging requests with a prompt version

from spanlens.integrations.openai import create_openai, with_prompt_version

client = create_openai()
res = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[...],
    **with_prompt_version("chatbot-system@3"),
)

The same pattern works for Anthropic. See spanlens.integrations.anthropic.

Mode 2. SDK tracing (multi-step agents)

Use the SDK when one user request spans multiple LLM calls, retrieval, tool use, etc. Spans appear nested under a single trace in the dashboard.

from spanlens import SpanlensClient

client = SpanlensClient(api_key="sl_live_...")

with client.start_trace("rag_pipeline", metadata={"user_id": "u_42"}) as trace:
    with trace.span("retrieve", span_type="retrieval") as span:
        docs = vector_store.similarity_search(query, k=5)
        span.end(output={"doc_count": len(docs)})

    with trace.span("generate", span_type="llm") as span:
        response = openai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=build_prompt(query, docs),
            extra_headers=span.trace_headers(),  # links proxy log to this span
        )
        usage = response.usage
        span.end(
            output=response.choices[0].message.content,
            prompt_tokens=usage.prompt_tokens,
            completion_tokens=usage.completion_tokens,
            total_tokens=usage.total_tokens,
        )

When a span / trace context manager exits with an exception, the span is automatically marked error with the exception message.
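
The error-marking behaviour follows the usual context-manager pattern. The sketch below uses only the standard library; SpanRecord, the status strings, and the decision to re-raise are assumptions made for illustration, not the SDK's implementation.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class SpanRecord:
    name: str
    status: str = "running"  # hypothetical status values
    error_message: Optional[str] = None


@contextmanager
def span(name: str) -> Iterator[SpanRecord]:
    record = SpanRecord(name)
    try:
        yield record
    except Exception as exc:
        # Mark the span as failed and keep the exception message.
        record.status = "error"
        record.error_message = str(exc)
        raise  # assumption: the caller still sees the original exception
    else:
        record.status = "completed"


captured: List[SpanRecord] = []
try:
    with span("generate") as rec:
        captured.append(rec)
        raise TimeoutError("provider timed out")
except TimeoutError:
    pass
```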

Helper: `observe_openai`

Boilerplate-free version of the LLM span. Auto-injects trace headers, auto-parses usage, and auto-ends the span:

from spanlens import observe_openai

result = observe_openai(trace, "answer", lambda headers:
    openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        extra_headers=headers,
    )
)

The same shape exists for Anthropic (observe_anthropic) and Gemini (observe_gemini).

Async support

observe() and observe_*() detect coroutines automatically. Pass an async callable (or, as here, a lambda that returns a coroutine) and await the result:

async def go():
    result = await observe_openai(trace, "answer", lambda h:
        async_openai.chat.completions.create(..., extra_headers=h),
    )
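
One common way to support sync and async callables behind a single helper is to inspect the return value. The sketch below shows that technique with inspect.isawaitable; the observe function here is a stand-in written for this example, and the header name is a placeholder, so neither should be read as the SDK's code.

```python
import asyncio
import inspect
from typing import Any, Callable, Dict, List, Tuple


def observe(name: str, fn: Callable[[Dict[str, str]], Any],
            log: List[Tuple[str, str]]) -> Any:
    headers = {"x-example-span": name}  # placeholder, not a real Spanlens header
    result = fn(headers)
    if inspect.isawaitable(result):
        # Async path: hand back a coroutine that finishes the span after awaiting.
        async def finish() -> Any:
            value = await result
            log.append((name, "async"))
            return value
        return finish()
    # Sync path: the call has already completed.
    log.append((name, "sync"))
    return result


async def fake_llm(headers: Dict[str, str]) -> str:
    await asyncio.sleep(0)  # stands in for a network call
    return "async answer"


events: List[Tuple[str, str]] = []
sync_value = observe("sync_call", lambda h: "sync answer", events)
async_value = asyncio.run(observe("async_call", fake_llm, events))
```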

Ollama (local LLMs)

observe_ollama() traces calls against a local Ollama instance. Use the OpenAI client pointed at Ollama's OpenAI-compatible endpoint, then wrap with the helper so the dashboard tags the span as provider: "ollama" instead of OpenAI:

from openai import OpenAI
from spanlens import SpanlensClient, observe_ollama

client = SpanlensClient(api_key="sl_live_...")
ollama = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",   # ignored by Ollama; required by the openai SDK
)

with client.start_trace("local_summarize") as trace:
    result = observe_ollama(trace, "llama3_summary", lambda h:
        ollama.chat.completions.create(
            model="llama3.1",
            messages=[{"role": "user", "content": "Summarize: ..."}],
            extra_headers=h,
        ),
    )

Cost is left as None because Ollama is self-hosted, so there is no per-token bill to compute.
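
A cost calculation that returns None for self-hosted providers might look like the sketch below. Everything here is hypothetical: estimate_cost is not an SDK function, and the rate table uses made-up placeholder numbers rather than real provider prices.

```python
from typing import Dict, Optional, Tuple

# Placeholder rates in USD per 1M tokens as (input, output). These numbers
# are invented for the example and are not real provider prices.
EXAMPLE_RATES: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("openai", "example-model"): (1.0, 2.0),
}


def estimate_cost(provider: str, model: str,
                  prompt_tokens: int, completion_tokens: int) -> Optional[float]:
    if provider == "ollama":
        return None  # self-hosted: there is no per-token bill
    rates = EXAMPLE_RATES.get((provider, model))
    if rates is None:
        return None  # unknown model: no rate to apply
    input_rate, output_rate = rates
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


hosted = estimate_cost("openai", "example-model", 1_000, 500)
local = estimate_cost("ollama", "llama3.1", 1_000, 500)
```

Returning None rather than 0.0 keeps "free" and "not applicable" distinguishable when costs are aggregated later.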

LangChain / LangGraph

SpanlensCallbackHandler plugs into LangChain's standard BaseCallbackHandler contract, so it works for plain LangChain chains, LCEL pipelines, and LangGraph compiled graphs without code changes. Every LLM / chain / tool / retriever node becomes a span with the run-id tree mirroring the graph topology.

from spanlens import SpanlensClient
from spanlens.integrations.langchain import SpanlensCallbackHandler

client = SpanlensClient(api_key="sl_live_...")
handler = SpanlensCallbackHandler(client=client)

# LangChain / LCEL
result = chain.invoke({"input": "Hello"}, config={"callbacks": [handler]})

# LangGraph
graph = workflow.compile()
result = graph.invoke({"input": "Hello"}, config={"callbacks": [handler]})
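
LangChain passes a run_id and a parent_run_id to its callback methods, which is what makes a span tree possible. The sketch below rebuilds such a tree from those two values using only the standard library; on_start and render are hypothetical helpers, not part of SpanlensCallbackHandler.

```python
import uuid
from collections import defaultdict
from typing import Dict, List, Optional

names: Dict[uuid.UUID, str] = {}
children: Dict[Optional[uuid.UUID], List[uuid.UUID]] = defaultdict(list)


def on_start(name: str, run_id: uuid.UUID,
             parent_run_id: Optional[uuid.UUID] = None) -> None:
    """Record one node; a parent_run_id of None marks a root."""
    names[run_id] = name
    children[parent_run_id].append(run_id)


def render(parent: Optional[uuid.UUID] = None, depth: int = 0) -> List[str]:
    """Depth-first listing, indented two spaces per nesting level."""
    lines: List[str] = []
    for run_id in children.get(parent, []):
        lines.append("  " * depth + names[run_id])
        lines.extend(render(run_id, depth + 1))
    return lines


chain_id, retriever_id, llm_id = uuid.uuid4(), uuid.uuid4(), uuid.uuid4()
on_start("chain", chain_id)
on_start("retriever", retriever_id, parent_run_id=chain_id)
on_start("llm", llm_id, parent_run_id=chain_id)
tree = render()
```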

Attach to an existing trace to nest the chain under a larger workflow:

with client.start_trace("agent_run") as trace:
    handler = SpanlensCallbackHandler(client=client, trace=trace)
    chain.invoke({"input": "..."}, config={"callbacks": [handler]})
    # ... other steps in the same trace ...

The handler depends on langchain-core at runtime. Install the spanlens[langchain] extra above, or rely on the LangChain packages you already use, which bring it in.

Configuration reference

SpanlensClient(
    api_key="sl_live_...",        # required
    base_url=None,                 # default: https://spanlens-server.vercel.app
    timeout_ms=3000,               # ingest timeout per call
    silent=True,                   # swallow errors so observability never crashes user code
    on_error=None,                 # callback (err, context) for non-silent monitoring
)
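
How silent and on_error interact is not spelled out above, so the following is one plausible reading rather than the SDK's documented behaviour: failures are always reported to on_error when it is set, and re-raised only when silent is False. safe_send is a hypothetical helper.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

ErrorCallback = Callable[[Exception, Dict[str, Any]], None]


def safe_send(
    send: Callable[[], None],
    context: Dict[str, Any],
    silent: bool = True,
    on_error: Optional[ErrorCallback] = None,
) -> bool:
    """Run send(); report failures via on_error; raise only if not silent."""
    try:
        send()
        return True
    except Exception as exc:
        if on_error is not None:
            on_error(exc, context)
        if not silent:
            raise
        return False


def failing_send() -> None:
    raise ConnectionError("ingest endpoint unreachable")


seen: List[Tuple[str, Dict[str, Any]]] = []
ok = safe_send(
    failing_send,
    {"span": "generate"},
    on_error=lambda err, ctx: seen.append((str(err), ctx)),
)
```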

Environment variables:

SPANLENS_API_KEY is picked up by create_openai(), create_anthropic(), and create_gemini() when api_key= is omitted.
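
The fallback from an explicit argument to the environment variable can be sketched as follows. resolve_api_key is a hypothetical helper, and raising ValueError when neither source is set is an assumption; the SDK may handle that case differently.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Prefer the explicit argument, then fall back to SPANLENS_API_KEY."""
    key = api_key or os.environ.get("SPANLENS_API_KEY")
    if not key:
        raise ValueError("Pass api_key= or set SPANLENS_API_KEY")
    return key


os.environ["SPANLENS_API_KEY"] = "sl_live_from_env"  # dummy value for the demo
from_env = resolve_api_key()
explicit = resolve_api_key("sl_live_explicit")
```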

Why the SDK is non-blocking

Every trace.end() / span.end() call returns immediately. Network I/O runs on a background thread pool with a configurable timeout, so:

Your hot path (the LLM call itself) is never slowed down.
The Spanlens server being slow / down does not crash your app.
Order is still preserved: a span POST always waits for its parent trace POST to finish, because the server's ownership check would otherwise 404 and the span would be silently lost.

For short-lived scripts, call client.close() before exit (or use with SpanlensClient(...) as client:) to drain the queue.
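
The non-blocking, parent-first behaviour described in this section can be illustrated with concurrent.futures. This is a sketch of the general technique under stated assumptions, not the SDK's ingest code: the post function, the pool size, and the sleep that stands in for network latency are all invented for the example.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List, Optional

pool = ThreadPoolExecutor(max_workers=4)
posted: List[str] = []
lock = threading.Lock()


def post(name: str, delay: float, parent: Optional[Future]) -> None:
    if parent is not None:
        parent.result()  # wait on the worker thread until the parent POST is done
    time.sleep(delay)  # stands in for network latency
    with lock:
        posted.append(name)


# submit() returns immediately, so the caller's hot path is not delayed.
trace_future = pool.submit(post, "trace", 0.05, None)
# The span is faster than the trace but still lands second.
span_future = pool.submit(post, "span", 0.0, trace_future)

pool.shutdown(wait=True)  # comparable to draining the queue before exit
```

The waiting happens inside the worker thread, so ordering is enforced without ever blocking the code that produced the span.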

Compatibility

Python 3.9, 3.10, 3.11, 3.12, 3.13
openai >= 1.0
anthropic >= 0.18
google-generativeai >= 0.5

Self-hosting

Point the SDK and proxy helpers at your own deployment:

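from spanlens import SpanlensClient
from spanlens.integrations.openai import create_openai

client = SpanlensClient(
    api_key="...",
    base_url="https://spanlens.mycompany.com",
)

# Named openai_client so it does not shadow the openai package
openai_client = create_openai(base_url="https://spanlens.mycompany.com/proxy/openai/v1")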

License

MIT. See LICENSE.

Release history

Version	Released
0.7.0 (this version)	Jun 13, 2026
0.6.0	Jun 5, 2026
0.5.1	May 27, 2026
0.5.0	May 20, 2026
0.4.0	May 20, 2026
0.3.0	May 19, 2026
0.2.0	May 19, 2026
0.1.0	Apr 25, 2026

Download files

Source Distribution

spanlens-0.7.0.tar.gz (48.9 kB)

Uploaded Jun 13, 2026 Source

Built Distribution

spanlens-0.7.0-py3-none-any.whl (60.9 kB)

Uploaded Jun 13, 2026 Python 3

File details

Details for the file spanlens-0.7.0.tar.gz.

File metadata

Download URL: spanlens-0.7.0.tar.gz
Upload date: Jun 13, 2026
Size: 48.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for spanlens-0.7.0.tar.gz
Algorithm	Hash digest
SHA256	`3d0441dffec2122a1e9b6d5002fcdd5e11d8d2b6b8c2ee3286e96943c460a84d`
MD5	`4718fb15c230ce7913d9345eecf93bcd`
BLAKE2b-256	`04c058566aa5002999c99b2ce92dfa9076058e194ed994f1b199d4617d62981d`

File details

Details for the file spanlens-0.7.0-py3-none-any.whl.

File metadata

Download URL: spanlens-0.7.0-py3-none-any.whl
Upload date: Jun 13, 2026
Size: 60.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for spanlens-0.7.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`efe8a3d074bae12dc1ee575328af63e7fcaf3d7a65bd7d7ece8ad57510b76881`
MD5	`919301f8f57859bff0b3315c4233c98c`
BLAKE2b-256	`7c75961170450d88e93887b921416f7968747bc61dbcc70fd7c8327fc0df08df`
