Spanlens SDK — agent tracing, LLM usage capture, and cost observability for Python.

Spanlens Python SDK

LLM observability for Python. Trace agent runs, capture token usage and cost, and link calls back to your Spanlens dashboard with one line of code.

Spanlens is the open-source LLM observability platform. This is the official Python SDK — for the dashboard, signup, and proxy docs, head to spanlens.io.


Install

pip install spanlens

# Or with provider integrations:
pip install "spanlens[openai]"
pip install "spanlens[anthropic]"
pip install "spanlens[gemini]"
pip install "spanlens[all]"

Two ways to use it

Mode         Best for                                                          Setup
Proxy        Single-call observability; drop-in for the OpenAI/Anthropic SDK   Replace base_url
SDK tracing  Multi-step agents, RAG, tool calls, manual spans                  SpanlensClient(...)

You can mix both. The proxy logs the raw request; the SDK groups multiple requests into a single trace with parent / child spans.


Mode 1 — Proxy (zero-code)

Get a Spanlens API key from your dashboard, then point your provider SDK at the Spanlens proxy:

from spanlens.integrations.openai import create_openai

# Reads SPANLENS_API_KEY from the environment
client = create_openai()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)

Spanlens automatically logs the request, response, latency, token counts, and cost — viewable in the dashboard under Requests.

Async (FastAPI, Django async views, asyncio)

Each helper has an async counterpart that returns the provider's async client:

from spanlens.integrations.openai import create_async_openai
from spanlens.integrations.anthropic import create_async_anthropic

async def handler() -> str:
    client = create_async_openai()  # openai.AsyncOpenAI
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    return resp.choices[0].message.content

The SDK's background ingest pool is thread-safe: you can fan out 50+ concurrent spans with asyncio.gather, and the ordering of trace and span POSTs is still preserved.
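As a rough illustration of the fan-out pattern mentioned above, the sketch below runs 50 coroutines through asyncio.gather. The function fake_llm_call is a hypothetical stand-in for an awaited provider call such as client.chat.completions.create(...); it is not part of Spanlens or the OpenAI SDK.

```python
import asyncio


async def fake_llm_call(i: int) -> str:
    """Hypothetical stand-in for an awaited provider call."""
    await asyncio.sleep(0.001)  # simulate network latency
    return f"reply-{i}"


async def fan_out(n: int = 50) -> list:
    # asyncio.gather schedules the coroutines concurrently and returns
    # their results in the same order as the inputs.
    return await asyncio.gather(*(fake_llm_call(i) for i in range(n)))


replies = asyncio.run(fan_out())
```

In a real handler you would replace fake_llm_call with a call on the client returned by create_async_openai().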

Tagging requests with a prompt version

from spanlens.integrations.openai import create_openai, with_prompt_version

client = create_openai()
res = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[...],
    **with_prompt_version("chatbot-system@3"),
)

The same pattern works for Anthropic — see spanlens.integrations.anthropic.


Mode 2 — SDK tracing (multi-step agents)

Use the SDK when one user request spans multiple LLM calls, retrieval, tool use, etc. Spans appear nested under a single trace in the dashboard.

from spanlens import SpanlensClient

client = SpanlensClient(api_key="sl_live_...")
# vector_store, query, build_prompt, and openai_client below are your application's own objects.

with client.start_trace("rag_pipeline", metadata={"user_id": "u_42"}) as trace:
    with trace.span("retrieve", span_type="retrieval") as span:
        docs = vector_store.similarity_search(query, k=5)
        span.end(output={"doc_count": len(docs)})

    with trace.span("generate", span_type="llm") as span:
        response = openai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=build_prompt(query, docs),
            extra_headers=span.trace_headers(),  # links proxy log to this span
        )
        usage = response.usage
        span.end(
            output=response.choices[0].message.content,
            prompt_tokens=usage.prompt_tokens,
            completion_tokens=usage.completion_tokens,
            total_tokens=usage.total_tokens,
        )

When a span or trace context manager exits with an exception, the span is automatically marked as error and records the exception message.
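The error-marking behaviour can be pictured with a small context manager. This is only a sketch of the general technique: SketchSpan and its status and error_message attributes are hypothetical names, not the Spanlens classes, and it assumes the exception is re-raised to the caller after being recorded.

```python
from typing import Optional


class SketchSpan:
    """Hypothetical stand-in for a span; not the Spanlens implementation."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.status = "running"
        self.error_message: Optional[str] = None

    def __enter__(self) -> "SketchSpan":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc is not None:
            # An exception escaped the with-block: record it on the span.
            self.status = "error"
            self.error_message = str(exc)
        elif self.status == "running":
            self.status = "completed"
        return False  # assumption: the exception still propagates


span = SketchSpan("retrieve")
try:
    with span:
        raise TimeoutError("vector store timed out")
except TimeoutError:
    pass  # the caller still sees the exception
```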

Helper: observe_openai

Boilerplate-free version of the LLM span — auto-injects trace headers, auto-parses usage, and auto-ends the span:

from spanlens import observe_openai

result = observe_openai(trace, "answer", lambda headers:
    openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        extra_headers=headers,
    )
)

The same shape exists for Anthropic (observe_anthropic) and Gemini (observe_gemini).

Async support

observe() and observe_*() detect coroutines automatically. Pass an async callable and await the result:

async def go():
    result = await observe_openai(trace, "answer", lambda h:
        async_openai.chat.completions.create(..., extra_headers=h),
    )
    return result
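One plausible way for a helper to support both sync and async callables is to check whether the callable's return value is awaitable. The sketch below shows that technique with the standard library only; observe_sketch and the header name are hypothetical and do not describe how Spanlens implements observe().

```python
import asyncio
import inspect


def observe_sketch(fn):
    """Hypothetical helper: call fn and adapt to sync or async results."""
    result = fn({"x-sketch-header": "1"})  # header name is made up
    if inspect.isawaitable(result):
        async def _await_and_finish():
            value = await result
            # A real helper would end the span here, after the await.
            return value
        return _await_and_finish()
    # Synchronous path: the value is already available.
    return result


def sync_call(headers):
    return "sync"


async def async_call(headers):
    return "async"


sync_result = observe_sketch(sync_call)
async_result = asyncio.run(observe_sketch(async_call))
```

Because the check is on the returned value, a plain lambda that returns a coroutine (as in the example above) takes the async path too.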

Configuration reference

SpanlensClient(
    api_key="sl_live_...",        # required
    base_url=None,                 # default: https://spanlens-server.vercel.app
    timeout_ms=3000,               # ingest timeout per call
    silent=True,                   # swallow errors so observability never crashes user code
    on_error=None,                 # callback (err, context) for non-silent monitoring
)
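To make the interplay of silent and on_error concrete, here is a minimal sketch of an error-swallowing wrapper. The function safe_ingest is hypothetical, and it assumes on_error is invoked even when silent=True; check the SDK source for the actual behaviour.

```python
from typing import Any, Callable, Optional


def safe_ingest(
    send: Callable[[], Any],
    silent: bool = True,
    on_error: Optional[Callable[[Exception, dict], None]] = None,
    context: Optional[dict] = None,
) -> None:
    """Hypothetical wrapper: run send() without letting it crash the caller."""
    try:
        send()
    except Exception as err:
        if on_error is not None:
            on_error(err, context or {})  # assumption: called in both modes
        if not silent:
            raise


seen = []


def failing_send():
    raise ConnectionError("ingest unreachable")


safe_ingest(
    failing_send,
    silent=True,
    on_error=lambda err, ctx: seen.append((type(err).__name__, ctx)),
    context={"op": "span.end"},
)
```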

Environment variables:

  • SPANLENS_API_KEY — picked up by create_openai(), create_anthropic(), create_gemini() when api_key= is omitted.
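The environment fallback can be sketched as follows. resolve_api_key is a hypothetical helper written for illustration; in particular, raising ValueError when no key is found is an assumption, not documented Spanlens behaviour.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical: prefer an explicit key, else fall back to the environment."""
    key = api_key or os.environ.get("SPANLENS_API_KEY")
    if not key:
        raise ValueError("Set SPANLENS_API_KEY or pass api_key=")
    return key


os.environ["SPANLENS_API_KEY"] = "sl_live_example"  # demo value only
from_env = resolve_api_key()
explicit = resolve_api_key("sl_live_explicit")
```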

Why the SDK is non-blocking

Every trace.end() / span.end() call returns immediately. Network I/O runs on a background thread pool with a configurable timeout, so:

  • Your hot path (the LLM call itself) is never slowed down.
  • The Spanlens server being slow / down does not crash your app.
  • Order is still preserved: a span POST always waits for its parent trace POST to finish. Without this, the server's ownership check would return 404 and the span would be silently lost.

For short-lived scripts, call client.close() before exit (or use with SpanlensClient(...) as client:) to drain the queue.
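The non-blocking, order-preserving design described above can be approximated with concurrent.futures. This is a sketch under stated assumptions, not the SDK's code: post_trace, post_span, and the posted log are hypothetical, and time.sleep stands in for the HTTP request.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)
posted = []  # records the order in which simulated POSTs complete
posted_lock = threading.Lock()


def _post(label: str, delay: float) -> None:
    time.sleep(delay)  # simulate network latency
    with posted_lock:
        posted.append(label)


def post_trace(name: str) -> Future:
    # submit() returns immediately; the work runs on a pool thread.
    return pool.submit(_post, f"trace:{name}", 0.05)


def post_span(name: str, parent: Future) -> Future:
    def run() -> None:
        parent.result()  # wait on the worker thread for the parent POST
        _post(f"span:{name}", 0.0)
    return pool.submit(run)


trace_future = post_trace("rag_pipeline")
span_future = post_span("retrieve", trace_future)  # does not block the caller
pool.shutdown(wait=True)  # drain pending work, similar in spirit to client.close()
```

The waiting happens inside the pool, so the calling thread never blocks on the network; only the final shutdown waits for outstanding work.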


Compatibility

  • Python 3.9, 3.10, 3.11, 3.12, 3.13
  • openai >= 1.0
  • anthropic >= 0.18
  • google-generativeai >= 0.5

Self-hosting

Point the SDK and proxy helpers at your own deployment:

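from spanlens import SpanlensClient
from spanlens.integrations.openai import create_openai

client = SpanlensClient(
    api_key="...",
    base_url="https://spanlens.mycompany.com",
)

# Named openai_client so it does not shadow the openai package
openai_client = create_openai(base_url="https://spanlens.mycompany.com/proxy/openai/v1")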

License

MIT — see LICENSE.
