Lightweight SDK for LLM inference logging and observability

llm-obs

Lightweight Python SDK for LLM inference logging and observability.

Auto-instruments OpenAI, Anthropic, Google Gemini, AWS Bedrock, Ollama, and any OpenAI-compatible endpoint, with no changes to your existing LLM call code.


Install

# Core SDK
pip install llm-obs

# With provider extras
pip install "llm-obs[openai]"
pip install "llm-obs[anthropic]"
pip install "llm-obs[gemini]"
pip install "llm-obs[bedrock]"
pip install "llm-obs[all]"

Quickstart — one line

from llm_obs import ObservabilityClient

obs = ObservabilityClient(
    endpoint="http://localhost:4000",   # your ingestion API
    api_key="dev-key",
)
obs.auto_instrument()   # patches all installed LLM libraries automatically

From this point, every LLM call in your app is logged automatically. No other changes needed.
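
To give a feel for what auto-instrumentation of this kind usually involves, here is a minimal sketch of wrapping a client method so that each call is timed and recorded. It is not taken from the llm-obs source: FakeClient, instrument, and the records list are hypothetical names used only for illustration.

```python
import functools
import time

records = []  # hypothetical in-memory sink; a real SDK would ship these to an ingestion API


class FakeClient:
    """Stand-in for an LLM client library (hypothetical)."""

    def complete(self, prompt):
        return f"echo: {prompt}"


def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that records latency and status."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        status = "error"
        try:
            result = original(self, *args, **kwargs)
            status = "success"
            return result
        finally:
            # Runs on success and on exception, so failed calls are recorded too.
            records.append({
                "method": method_name,
                "latency_ms": (time.perf_counter() - start) * 1000,
                "status": status,
            })

    setattr(cls, method_name, wrapper)


instrument(FakeClient, "complete")
reply = FakeClient().complete("hi")  # the call site itself is unchanged
```

Because the patch is applied to the class, code that already calls the method keeps working and simply gains a log record per call.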


Stream chat

import asyncio

from llm_obs import stream_chat, set_obs_context

# Set conversation context (picked up automatically by the SDK)
set_obs_context(conversation_id="conv-123")

async def main():
    # Unified streaming across all providers.
    # `async for` is only valid inside a coroutine, hence the wrapper.
    async for chunk in stream_chat(provider="openai", model="gpt-4o-mini", messages=[
        {"role": "user", "content": "Explain Redis in one sentence."}
    ]):
        print(chunk, end="", flush=True)

asyncio.run(main())
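
One common way for a value such as conversation_id to be picked up automatically in async code is Python's contextvars module. The sketch below assumes that approach; the README does not say how llm-obs implements it, and set_context and current_context are hypothetical names.

```python
import asyncio
import contextvars

# Hypothetical context variable; each asyncio task works on its own copy.
_conversation_id = contextvars.ContextVar("conversation_id", default=None)


def set_context(conversation_id):
    _conversation_id.set(conversation_id)


def current_context():
    return {"conversation_id": _conversation_id.get()}


async def handle(conversation_id):
    set_context(conversation_id)
    await asyncio.sleep(0)    # yield so the two tasks interleave
    return current_context()  # still this task's own value


async def main():
    return await asyncio.gather(handle("conv-1"), handle("conv-2"))


results = asyncio.run(main())
```

With this design, concurrent requests do not overwrite each other's conversation ID, which a plain module-level global would not guarantee.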

Provider detection from URL

from llm_obs import detect_provider, available_providers
import os

os.environ["LLM_ENDPOINTS"] = "http://localhost:11434"  # Ollama, vLLM, or any URL

# SDK probes the URL and detects what's running
providers = available_providers()
# → {"ollama": ["gemma3:4b", "llama3.2", ...]}

Supported URL detection:

  • Ollama — detected via GET /api/tags
  • vLLM / LiteLLM / LocalAI — detected via GET /v1/models
  • AWS Bedrock — detected from URL pattern (amazonaws.com)
  • OpenAI / Anthropic / Google — detected from known API URL patterns
  • Private VPC — probed automatically
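
The offline part of the detection rules above can be sketched as follows: match the hostname against known patterns first, and fall back to a list of probe URLs otherwise. This is an illustration under assumptions, not the SDK's code. classify_endpoint, KNOWN_HOSTS, and PROBE_PATHS are hypothetical names, the hostnames are the providers' public API hosts, and no network request is made.

```python
from urllib.parse import urlparse

# Hostname suffixes for hosted APIs. The exact patterns llm-obs matches are
# not documented here; these are assumptions for illustration.
KNOWN_HOSTS = {
    "api.openai.com": "openai",
    "api.anthropic.com": "anthropic",
    "generativelanguage.googleapis.com": "gemini",
    "amazonaws.com": "bedrock",
}

# Probe paths taken from the list above, tried in order for unknown hosts.
PROBE_PATHS = [("/api/tags", "ollama"), ("/v1/models", "openai-compatible")]


def classify_endpoint(url):
    """Return (provider, probes); provider is None when probing is needed."""
    host = urlparse(url).hostname or ""
    for suffix, provider in KNOWN_HOSTS.items():
        if host == suffix or host.endswith("." + suffix):
            return provider, []
    base = url.rstrip("/")
    return None, [(base + path, provider) for path, provider in PROBE_PATHS]
```

A caller would then issue a GET against each returned probe URL and take the first one that answers successfully.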

What gets logged per call

Field                                Description
provider / model                     Who served the request
latency_ms                           Total wall-clock time
ttft_ms                              Time-to-first-token (streaming)
prompt_tokens / completion_tokens    Token usage
cost_usd                             Computed from built-in price table
status                               success, error, cancelled
request / response                   PII-redacted payloads
conversation_id                      Linked via set_obs_context()
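
As a rough illustration of how a cost_usd value can be derived from token counts and a price table, consider the sketch below. The table contents, the per-million-token unit, and the function name are assumptions; the prices are placeholders, not real provider prices and not the table shipped with llm-obs.

```python
# Hypothetical price table in USD per 1M tokens (placeholder values).
PRICES_PER_MILLION = {
    ("example-provider", "example-model"): {"prompt": 1.00, "completion": 2.00},
}


def cost_usd(provider, model, prompt_tokens, completion_tokens):
    """Return the estimated cost, or None if the model is not in the table."""
    price = PRICES_PER_MILLION.get((provider, model))
    if price is None:
        return None
    return (prompt_tokens * price["prompt"]
            + completion_tokens * price["completion"]) / 1_000_000
```

Returning None for unknown models keeps a missing price visibly distinct from a call that genuinely cost nothing.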

PII redaction

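PII is redacted in-process, before any data leaves the application over HTTP. Covered patterns: email addresses, phone numbers, SSNs, credit card numbers (Luhn-checked), API keys, IPv4 addresses, and secrets embedded in URLs.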

obs = ObservabilityClient(..., redact_pii=True)   # default: True
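
For readers unfamiliar with the technique, here is a minimal sketch of regex-based redaction combined with a Luhn checksum for card numbers. It covers only two of the categories listed above. The regular expressions, the placeholder tokens, and the function names are assumptions for illustration, not the SDK's actual rules.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number):
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text):
    text = EMAIL_RE.sub("[EMAIL]", text)
    # Only replace digit runs that pass the Luhn check, to limit false positives.
    return CARD_RE.sub(
        lambda m: "[CARD]" if luhn_valid(m.group()) else m.group(), text
    )
```

The Luhn check is what lets a redactor leave order numbers and other long digit strings alone while still catching plausible card numbers.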

Manual span

span = obs.start_span(
    provider="openai",
    model="gpt-4o-mini",
    request={"messages": [{"role": "user", "content": "Hello"}]},
    conversation_id="conv-123",
)
span.set_ttft(ms=210)
span.set_usage(prompt_tokens=42, completion_tokens=11)
span.end(status="success", streamed=True)
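
When recording a manual span for a streamed response, the time-to-first-token and total latency have to be measured by the caller. One way to do that is to wrap the chunk iterator, as in this sketch; timed_stream and the timings dict are hypothetical helpers, not part of llm-obs.

```python
import time


def timed_stream(chunks, timings):
    """Yield chunks unchanged while recording ttft_ms and latency_ms in `timings`."""
    start = time.perf_counter()
    try:
        for chunk in chunks:
            if "ttft_ms" not in timings:
                # First chunk has arrived: record time-to-first-token once.
                timings["ttft_ms"] = (time.perf_counter() - start) * 1000
            yield chunk
    finally:
        timings["latency_ms"] = (time.perf_counter() - start) * 1000


timings = {}
text = "".join(timed_stream(iter(["Hel", "lo"]), timings))
```

The measured timings["ttft_ms"] could then be passed to span.set_ttft(ms=...) before ending the span as shown above.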

License

MIT
