Spanlens SDK for Python. Agent tracing, LLM usage capture, and cost observability.
Spanlens Python SDK
LLM observability for Python. Trace agent runs, capture token usage and cost, and link calls back to your Spanlens dashboard with one line of code.
Spanlens is the open-source LLM observability platform. This is the official Python SDK. For the dashboard, signup, and proxy docs, head to spanlens.io.
Install
pip install spanlens
# Or with provider integrations:
pip install "spanlens[openai]"
pip install "spanlens[anthropic]"
pip install "spanlens[gemini]"
pip install "spanlens[langchain]"
pip install "spanlens[all]"
Two ways to use it
| Mode | Best for | Setup |
|---|---|---|
| Proxy | Single-call observability, drop-in for the OpenAI/Anthropic SDK | Replace base_url |
| SDK tracing | Multi-step agents, RAG, tool calls, manual spans | SpanlensClient(...) |
You can mix both. The proxy logs the raw request; the SDK groups multiple requests into a single trace with parent / child spans.
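As a rough illustration of how the two modes fit together, the sketch below groups flat proxy-style request logs by a shared trace id. The record fields (`id`, `trace_id`, `span`) and the `group_by_trace` helper are hypothetical and are not the Spanlens log schema.

```python
from collections import defaultdict

# Hypothetical flat request logs, roughly as a proxy might record them.
request_logs = [
    {"id": "req_1", "trace_id": "t_1", "span": "retrieve"},
    {"id": "req_2", "trace_id": "t_1", "span": "generate"},
    {"id": "req_3", "trace_id": None, "span": None},  # proxy-only call, no trace
]


def group_by_trace(logs):
    """Group request ids under the trace they belong to (None = untraced)."""
    grouped = defaultdict(list)
    for log in logs:
        grouped[log["trace_id"]].append(log["id"])
    return dict(grouped)


traces = group_by_trace(request_logs)
```

Requests sent through the proxy alone land in the untraced bucket; requests that carry a trace id are shown together.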
Mode 1. Proxy (zero-code)
Get a Spanlens API key from your dashboard, then point your provider SDK at the Spanlens proxy:
from spanlens.integrations.openai import create_openai

# Reads SPANLENS_API_KEY from the environment
client = create_openai()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
Spanlens automatically logs the request, response, latency, token counts, and cost. View them in the dashboard under Requests.
Async (FastAPI, Django async views, asyncio)
Matching async helpers return the provider's async client:
from spanlens.integrations.openai import create_async_openai
from spanlens.integrations.anthropic import create_async_anthropic

async def handler() -> str:
    client = create_async_openai()  # openai.AsyncOpenAI
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    return resp.choices[0].message.content
The SDK's background ingest pool is thread-safe: you can fan out 50+ concurrent
spans with asyncio.gather, and trace/span POST ordering is still preserved.
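The fan-out pattern can be sketched with the standard library alone. `SpanRecorder` below is a hypothetical stand-in for a thread-safe span sink, not the Spanlens ingest pool; it only shows that many concurrent tasks can record through one lock-protected object.

```python
import asyncio
import threading


class SpanRecorder:
    """Hypothetical thread-safe sink; a lock guards the shared list."""

    def __init__(self):
        self._lock = threading.Lock()
        self.names = []

    def record(self, name):
        with self._lock:
            self.names.append(name)


async def traced_call(recorder, index):
    await asyncio.sleep(0)  # stands in for an awaited LLM call
    # Run the blocking record step on a worker thread so the loop stays free.
    await asyncio.to_thread(recorder.record, f"span_{index}")
    return index


async def fan_out(count):
    recorder = SpanRecorder()
    results = await asyncio.gather(
        *(traced_call(recorder, i) for i in range(count))
    )
    return recorder, results


recorder, results = asyncio.run(fan_out(50))
```

`asyncio.gather` returns results in the order the awaitables were passed, even though the recorded spans may arrive in any order.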
Tagging requests with a prompt version
from spanlens.integrations.openai import create_openai, with_prompt_version

client = create_openai()
res = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[...],
    **with_prompt_version("chatbot-system@3"),
)
The same pattern works for Anthropic. See
spanlens.integrations.anthropic.
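The `**` in the call above implies that the helper returns a dict of keyword arguments. One plausible shape, shown purely as an assumption, is a dict carrying an extra request header. The function `prompt_version_kwargs`, the header name `x-example-prompt-version`, and `fake_create` are all hypothetical; the real helper may work differently.

```python
def prompt_version_kwargs(tag):
    """Return keyword arguments that tag a request with 'name@version'."""
    name, sep, version = tag.partition("@")
    if not (name and sep and version):
        raise ValueError(f"expected 'name@version', got {tag!r}")
    # Hypothetical header name, chosen only for this illustration.
    return {"extra_headers": {"x-example-prompt-version": f"{name}@{version}"}}


def fake_create(model, messages, extra_headers=None):
    """Stand-in for a provider call; echoes what it was given."""
    return {"model": model, "headers": extra_headers or {}}


call = fake_create(
    model="gpt-4o-mini",
    messages=[],
    **prompt_version_kwargs("chatbot-system@3"),
)
```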
Mode 2. SDK tracing (multi-step agents)
Use the SDK when one user request spans multiple LLM calls, retrieval, tool use, etc. Spans appear nested under a single trace in the dashboard.
from spanlens import SpanlensClient

client = SpanlensClient(api_key="sl_live_...")

with client.start_trace("rag_pipeline", metadata={"user_id": "u_42"}) as trace:
    with trace.span("retrieve", span_type="retrieval") as span:
        docs = vector_store.similarity_search(query, k=5)
        span.end(output={"doc_count": len(docs)})

    with trace.span("generate", span_type="llm") as span:
        response = openai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=build_prompt(query, docs),
            extra_headers=span.trace_headers(),  # links proxy log to this span
        )
        usage = response.usage
        span.end(
            output=response.choices[0].message.content,
            prompt_tokens=usage.prompt_tokens,
            completion_tokens=usage.completion_tokens,
            total_tokens=usage.total_tokens,
        )
When a span / trace context manager exits with an exception, the span is
automatically marked error with the exception message.
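The error-on-exit behavior follows a common context-manager pattern. The sketch below is a minimal stdlib illustration of that pattern, not the SDK's implementation; the record dict and its `status` values are assumptions.

```python
from contextlib import contextmanager


@contextmanager
def span(records, name):
    """Record a span; mark it 'error' if the body raises, then re-raise."""
    record = {"name": name, "status": "running", "error": None}
    records.append(record)
    try:
        yield record
    except Exception as exc:
        record["status"] = "error"
        record["error"] = str(exc)
        raise  # never hide the caller's exception
    else:
        record["status"] = "completed"


records = []
try:
    with span(records, "retrieve"):
        raise TimeoutError("vector store timed out")
except TimeoutError:
    pass

with span(records, "generate"):
    pass
```

Re-raising keeps the application's own error handling intact while the span still captures the failure.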
Helper: observe_openai
Boilerplate-free version of the LLM span. Auto-injects trace headers,
auto-parses usage, and auto-ends the span:
from spanlens import observe_openai

result = observe_openai(trace, "answer", lambda headers:
    openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        extra_headers=headers,
    )
)
The same shape exists for Anthropic (observe_anthropic) and Gemini
(observe_gemini).
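To make the three "auto" steps concrete, here is a rough stdlib sketch of what such a wrapper could do. `observe_call`, `FakeSpan`, `fake_create`, the header name, and the `error=` keyword are hypothetical; unlike the real helpers, this version takes an already-open span rather than a trace and a name.

```python
from types import SimpleNamespace


class FakeSpan:
    """Stand-in exposing trace_headers() and end(), as used earlier."""

    def __init__(self):
        self.ended_with = None

    def trace_headers(self):
        return {"x-example-span-id": "span_1"}  # hypothetical header name

    def end(self, **fields):
        self.ended_with = fields


def observe_call(span, call):
    """Inject headers, run the call, parse usage, and end the span."""
    try:
        response = call(span.trace_headers())
    except Exception as exc:
        span.end(error=str(exc))  # 'error' keyword is an assumption
        raise
    usage = getattr(response, "usage", None)
    span.end(
        prompt_tokens=getattr(usage, "prompt_tokens", None),
        completion_tokens=getattr(usage, "completion_tokens", None),
        total_tokens=getattr(usage, "total_tokens", None),
    )
    return response


def fake_create(extra_headers):
    usage = SimpleNamespace(prompt_tokens=12, completion_tokens=5, total_tokens=17)
    return SimpleNamespace(usage=usage, headers=extra_headers)


span = FakeSpan()
response = observe_call(span, lambda headers: fake_create(extra_headers=headers))
```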
Async support
observe() and observe_*() detect coroutines automatically. Pass an async
callable and await the result:
async def go():
    result = await observe_openai(trace, "answer", lambda h:
        async_openai.chat.completions.create(..., extra_headers=h),
    )
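One common way to support both sync and async callables from a single entry point is to inspect what the callable returns. The sketch below uses `inspect.isawaitable` for that; the `observe` function here is a simplified illustration with a hypothetical signature, not the SDK's own code.

```python
import asyncio
import inspect


def observe(events, name, fn):
    """Run fn; if it returns an awaitable, finish the span after awaiting."""
    events.append(f"start:{name}")
    result = fn()
    if inspect.isawaitable(result):

        async def finish():
            try:
                return await result
            finally:
                events.append(f"end:{name}")

        return finish()  # caller must await this
    events.append(f"end:{name}")
    return result


async def fetch():
    await asyncio.sleep(0)  # stands in for an async LLM call
    return 2


events = []
sync_value = observe(events, "sync", lambda: 1)
async_value = asyncio.run(observe(events, "async", lambda: fetch()))
```

In the async case the span is closed only once the awaited call has finished, so the recorded duration covers the real work.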
Ollama (local LLMs)
observe_ollama() traces calls against a local Ollama instance. Use the OpenAI client pointed at Ollama's OpenAI-compatible endpoint, then wrap with the helper so the dashboard tags the span as provider: "ollama" instead of OpenAI:
from openai import OpenAI
from spanlens import SpanlensClient, observe_ollama

client = SpanlensClient(api_key="sl_live_...")
ollama = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # ignored by Ollama; required by the openai SDK
)

with client.start_trace("local_summarize") as trace:
    result = observe_ollama(trace, "llama3_summary", lambda h:
        ollama.chat.completions.create(
            model="llama3.1",
            messages=[{"role": "user", "content": "Summarize: ..."}],
            extra_headers=h,
        ),
    )
Cost is left as None because Ollama is self-hosted, so there is no per-token bill to compute.
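The distinction between "no cost" and "zero cost" can be sketched as follows: a cost is derived only when a per-token price is known, and is `None` otherwise. The price table, the model name `example-hosted-model`, and `estimate_cost` are hypothetical and do not reflect real pricing or the Spanlens cost logic.

```python
from typing import Optional

# Hypothetical prices in USD per 1M tokens, for illustration only.
PRICES = {"example-hosted-model": {"prompt": 1.0, "completion": 2.0}}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> Optional[float]:
    """Return a USD cost, or None when no price is known for the model."""
    price = PRICES.get(model)
    if price is None:
        return None  # e.g. a self-hosted model with no per-token bill
    total = prompt_tokens * price["prompt"] + completion_tokens * price["completion"]
    return total / 1_000_000


hosted_cost = estimate_cost("example-hosted-model", 1_000_000, 500_000)
local_cost = estimate_cost("llama3.1", 100, 50)
```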
LangChain / LangGraph
SpanlensCallbackHandler plugs into LangChain's standard BaseCallbackHandler
contract, so it works for plain LangChain chains, LCEL pipelines, and
LangGraph compiled graphs without code changes. Every LLM / chain / tool /
retriever node becomes a span with the run-id tree mirroring the graph
topology.
from spanlens import SpanlensClient
from spanlens.integrations.langchain import SpanlensCallbackHandler
client = SpanlensClient(api_key="sl_live_...")
handler = SpanlensCallbackHandler(client=client)
# LangChain / LCEL
result = chain.invoke({"input": "Hello"}, config={"callbacks": [handler]})
# LangGraph
graph = workflow.compile()
result = graph.invoke({"input": "Hello"}, config={"callbacks": [handler]})
Attach to an existing trace to nest the chain under a larger workflow:
with client.start_trace("agent_run") as trace:
    handler = SpanlensCallbackHandler(client=client, trace=trace)
    chain.invoke({"input": "..."}, config={"callbacks": [handler]})
    # ... other steps in the same trace ...
The handler depends on langchain-core at runtime. Either install the
spanlens[langchain] extra above, or rely on a LangChain package you already
use, which will bring it in.
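LangChain passes a `run_id` and a `parent_run_id` to each callback, which is what makes a span tree possible. The sketch below rebuilds a nested tree from such pairs using plain dictionaries. The event records and `build_tree` are illustrative only and are not necessarily how SpanlensCallbackHandler is implemented.

```python
from collections import defaultdict


def build_tree(events):
    """Nest events under their parent using (run_id, parent_run_id) pairs."""
    children = defaultdict(list)
    for event in events:
        children[event["parent_run_id"]].append(event)

    def attach(parent_id):
        return [
            {"name": e["name"], "children": attach(e["run_id"])}
            for e in children[parent_id]
        ]

    return attach(None)  # roots have no parent


# Hypothetical callback events, in arrival order.
events = [
    {"run_id": "r1", "parent_run_id": None, "name": "chain"},
    {"run_id": "r2", "parent_run_id": "r1", "name": "retriever"},
    {"run_id": "r3", "parent_run_id": "r1", "name": "llm"},
    {"run_id": "r4", "parent_run_id": "r3", "name": "tool"},
]
tree = build_tree(events)
```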
Configuration reference
SpanlensClient(
    api_key="sl_live_...",  # required
    base_url=None,          # default: https://spanlens-server.vercel.app
    timeout_ms=3000,        # ingest timeout per call
    silent=True,            # swallow errors so observability never crashes user code
    on_error=None,          # callback (err, context) for non-silent monitoring
)
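How `silent` and `on_error` might interact can be sketched like this: the callback is always notified, and the exception propagates only when `silent` is off. `safe_ingest`, `failing_send`, and the shape of the context dict are hypothetical.

```python
def safe_ingest(send, payload, silent=True, on_error=None):
    """Send a payload; report failures without crashing when silent."""
    try:
        send(payload)
        return True
    except Exception as err:
        if on_error is not None:
            on_error(err, {"payload": payload})  # (err, context) callback
        if not silent:
            raise
        return False


def failing_send(payload):
    raise ConnectionError("ingest endpoint unreachable")


seen = []
ok = safe_ingest(
    failing_send,
    {"span": "generate"},
    silent=True,
    on_error=lambda err, ctx: seen.append((str(err), ctx)),
)
```

With `silent=True` the caller only sees a `False` return value, while the callback still receives the error for logging or alerting.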
Environment variables:
SPANLENS_API_KEY is picked up by create_openai(), create_anthropic(), and create_gemini() when api_key= is omitted.
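The fallback described here amounts to "explicit argument first, environment second". A minimal sketch of that lookup follows; `resolve_api_key` is a hypothetical helper, and the error raised when no key is found is an assumption about reasonable behavior, not documented SDK behavior.

```python
import os


def resolve_api_key(api_key=None, env=None):
    """Prefer an explicit key; otherwise fall back to SPANLENS_API_KEY."""
    env = os.environ if env is None else env
    key = api_key or env.get("SPANLENS_API_KEY")
    if not key:
        raise ValueError("pass api_key= or set SPANLENS_API_KEY")
    return key


from_env = resolve_api_key(env={"SPANLENS_API_KEY": "sl_live_from_env"})
explicit = resolve_api_key("sl_live_explicit", env={"SPANLENS_API_KEY": "sl_live_from_env"})
```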
Why the SDK is non-blocking
Every trace.end() / span.end() call returns immediately. Network I/O
runs on a background thread pool with a configurable timeout, so:
- Your hot path (the LLM call itself) is never slowed down.
- The Spanlens server being slow / down does not crash your app.
- Order is still preserved: a span POST always waits for its parent trace POST to finish, because the server's ownership check would otherwise 404 and the span would be silently lost.
For short-lived scripts, call client.close() before exit (or use
with SpanlensClient(...) as client:) to drain the queue.
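The three properties above (immediate return, background I/O, parent-before-child ordering) can be sketched with `concurrent.futures`. `IngestQueue` is a hypothetical illustration, not the SDK's internals: the span task waits on its parent trace's future before posting, and `close()` drains pending work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class IngestQueue:
    """Hypothetical background queue with parent-before-child ordering."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self.posted = []

    def _post(self, kind, name, delay):
        time.sleep(delay)  # stands in for network latency
        with self._lock:
            self.posted.append((kind, name))

    def post_trace(self, name, delay=0.05):
        # Returns at once; the POST runs on a worker thread.
        return self._pool.submit(self._post, "trace", name, delay)

    def post_span(self, name, parent_future, delay=0.0):
        def task():
            parent_future.result()  # wait until the parent trace POST is done
            self._post("span", name, delay)

        # The parent is always submitted first, so it has already started
        # by the time this task runs and the wait cannot deadlock.
        return self._pool.submit(task)

    def close(self):
        self._pool.shutdown(wait=True)  # drain the queue before exit


queue = IngestQueue()
trace_future = queue.post_trace("rag_pipeline")
queue.post_span("retrieve", trace_future)
queue.close()
```

Even though the trace POST is the slower of the two, the span is recorded after it, and `close()` blocks until both have completed.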
Compatibility
- Python 3.9, 3.10, 3.11, 3.12, 3.13
- openai >= 1.0
- anthropic >= 0.18
- google-generativeai >= 0.5
Self-hosting
Point the SDK and proxy helpers at your own deployment:
from spanlens import SpanlensClient
from spanlens.integrations.openai import create_openai

client = SpanlensClient(
    api_key="...",
    base_url="https://spanlens.mycompany.com",
)
openai = create_openai(base_url="https://spanlens.mycompany.com/proxy/openai/v1")
License
MIT. See LICENSE.