llm-obs

Lightweight Python SDK for LLM inference logging and observability.

Auto-instruments OpenAI, Anthropic, Google Gemini, AWS Bedrock, Ollama, and any OpenAI-compatible endpoint, with no changes to your LLM call code.


Install

# Core SDK
pip install llm-obs

# With provider extras
pip install "llm-obs[openai]"
pip install "llm-obs[anthropic]"
pip install "llm-obs[gemini]"
pip install "llm-obs[bedrock]"
pip install "llm-obs[all]"

Quickstart — one line

from llm_obs import ObservabilityClient

obs = ObservabilityClient(
    endpoint="http://localhost:4000",   # your ingestion API
    api_key="dev-key",
)
obs.auto_instrument()   # patches all installed LLM libraries automatically

From this point, every LLM call in your app is logged automatically. No other changes needed.
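
To give a feel for what "patching" a library can mean, here is a minimal sketch of wrapping a method so that each call is timed and recorded. It is an illustration under stated assumptions, not the SDK's actual implementation: `instrument`, `FakeClient`, and the `records` list are hypothetical names introduced only for this example.

```python
import functools
import time


def instrument(owner, method_name, sink):
    """Replace owner.method_name with a wrapper that appends one record per call."""
    original = getattr(owner, method_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "error"  # overwritten only if the call returns normally
        try:
            result = original(*args, **kwargs)
            status = "success"
            return result
        finally:
            sink.append({
                "call": method_name,
                "status": status,
                "latency_ms": (time.perf_counter() - start) * 1000,
            })

    setattr(owner, method_name, wrapper)
    return original  # returned so the caller can undo the patch


class FakeClient:
    """Hypothetical stand-in for a provider client class."""

    def complete(self, prompt):
        return prompt.upper()


records = []
instrument(FakeClient, "complete", records)
reply = FakeClient().complete("hello")
```

Because the wrapper replaces the method on the class, existing call sites such as `FakeClient().complete(...)` stay unchanged, which is the property the quickstart describes.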


Stream chat

import asyncio

from llm_obs import stream_chat, set_obs_context


async def main():
    # Set conversation context (picked up automatically by the SDK)
    set_obs_context(conversation_id="conv-123")

    # Unified streaming across all providers
    async for chunk in stream_chat(
        provider="openai",
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Explain Redis in one sentence."}],
    ):
        print(chunk, end="", flush=True)


# "async for" must run inside a coroutine, so drive it with asyncio.run()
asyncio.run(main())
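
One common way to make a context value available "automatically" to later calls in async code is the standard-library `contextvars` module. The sketch below shows that general technique only. Whether the SDK uses it internally is an assumption, and `set_context`, `build_log_record`, and `handle` are hypothetical names, not part of `llm_obs`.

```python
import asyncio
import contextvars

# One context variable per piece of ambient state (hypothetical name)
_conversation_id = contextvars.ContextVar("conversation_id", default=None)


def set_context(conversation_id):
    """Store the conversation ID for the current task."""
    _conversation_id.set(conversation_id)


def build_log_record(model):
    """Read the ambient conversation ID without it being passed explicitly."""
    return {"model": model, "conversation_id": _conversation_id.get()}


async def handle(conversation_id):
    set_context(conversation_id)
    await asyncio.sleep(0)  # yield control so the two tasks interleave
    return build_log_record("gpt-4o-mini")


async def main():
    # gather() runs each coroutine as its own task with its own context copy
    return await asyncio.gather(handle("conv-1"), handle("conv-2"))


results = asyncio.run(main())
```

Each task sees only the ID it set itself, so concurrent conversations do not overwrite each other's context.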

Provider detection from URL

from llm_obs import detect_provider, available_providers
import os

os.environ["LLM_ENDPOINTS"] = "http://localhost:11434"  # Ollama, vLLM, or any URL

# SDK probes the URL and detects what's running
providers = available_providers()
# → {"ollama": ["gemma3:4b", "llama3.2", ...]}

Supported URL detection:

  • Ollama — detected via GET /api/tags
  • vLLM / LiteLLM / LocalAI — detected via GET /v1/models
  • AWS Bedrock — detected from URL pattern (amazonaws.com)
  • OpenAI / Anthropic / Google — detected from known API URL patterns
  • Private VPC — probed automatically
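
The detection rules listed above can be sketched roughly as follows. This is an assumption-laden illustration rather than the SDK's code: `classify`, `KNOWN_HOSTS`, and the injectable `http_get` callable (which returns a response body string, or `None` on failure) are hypothetical. The sketch relies on two response shapes: Ollama's `/api/tags` returns a JSON object with a `models` key, and OpenAI-style `/v1/models` returns one with a `data` key.

```python
import json
from urllib.parse import urlparse

# Hosted APIs recognised purely from the hostname
KNOWN_HOSTS = {
    "api.openai.com": "openai",
    "api.anthropic.com": "anthropic",
    "generativelanguage.googleapis.com": "gemini",
}


def _json_has_key(body, key):
    """True if body parses as a JSON object containing key."""
    if body is None:
        return False
    try:
        parsed = json.loads(body)
    except ValueError:
        return False
    return isinstance(parsed, dict) and key in parsed


def classify(url, http_get=None):
    """Guess the provider behind url; probe only if http_get is supplied."""
    host = urlparse(url).hostname or ""
    if host in KNOWN_HOSTS:
        return KNOWN_HOSTS[host]
    if host.endswith(".amazonaws.com"):
        return "bedrock"
    if http_get is None:
        return "unknown"

    base = url.rstrip("/")
    if _json_has_key(http_get(base + "/api/tags"), "models"):
        return "ollama"
    if _json_has_key(http_get(base + "/v1/models"), "data"):
        return "openai-compatible"
    return "unknown"


# Demo with canned responses instead of real network calls
fake_responses = {"http://localhost:11434/api/tags": '{"models": []}'}
detected = classify("http://localhost:11434", fake_responses.get)
```

Passing the HTTP function in as a parameter keeps the URL-pattern logic testable without a running server.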

What gets logged per call

| Field | Description |
| --- | --- |
| provider / model | Who served the request |
| latency_ms | Total wall-clock time |
| ttft_ms | Time-to-first-token (streaming) |
| prompt_tokens / completion_tokens | Token usage |
| cost_usd | Computed from built-in price table |
| status | success, error, cancelled |
| request / response | PII-redacted payloads |
| conversation_id | Linked via set_obs_context() |
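
As an illustration of how a `cost_usd` value can be derived from token counts and a price table, here is a small sketch. The table contents, the per-million-token pricing unit, and the names `PRICES` and `cost_usd` are assumptions made for the example. The SDK's built-in table and its units are not shown in this document.

```python
# Hypothetical prices in USD per 1,000,000 tokens (not real provider pricing)
PRICES = {
    ("example-provider", "example-model"): {"input": 1.00, "output": 2.00},
}


def cost_usd(provider, model, prompt_tokens, completion_tokens):
    """Return the call cost in USD, or None when the model has no price entry."""
    price = PRICES.get((provider, model))
    if price is None:
        return None
    total = prompt_tokens * price["input"] + completion_tokens * price["output"]
    return total / 1_000_000


example_cost = cost_usd("example-provider", "example-model", 42, 11)
```

Returning `None` for an unknown model, rather than `0`, keeps "free" and "price unknown" distinguishable in downstream dashboards.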

PII redaction

PII is redacted in-process, before any data leaves over HTTP. Covered patterns: email addresses, phone numbers, SSNs, credit card numbers (Luhn-validated), API keys, IPv4 addresses, and secrets embedded in URLs.

obs = ObservabilityClient(..., redact_pii=True)   # default: True
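
To show what regex-based redaction with a Luhn check can look like, here is a minimal sketch covering only two of the categories above (emails and card numbers). The patterns, placeholder strings, and function names are assumptions for illustration and are not taken from the SDK.

```python
import re

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# 13 to 19 digits, optionally separated by single spaces or hyphens
CARD = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text):
    """Replace emails and Luhn-valid card numbers with placeholders."""

    def card_sub(match):
        digits = re.sub(r"\D", "", match.group())
        # Leave digit runs that fail the checksum (order IDs, etc.) untouched
        return "[CARD]" if luhn_valid(digits) else match.group()

    text = CARD.sub(card_sub, text)
    return EMAIL.sub("[EMAIL]", text)


clean = redact("mail alice@example.com, card 4111 1111 1111 1111")
```

The Luhn check reduces false positives: a 16-digit order number that fails the checksum is left as-is, while a well-formed test card number is masked.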

Manual span

span = obs.start_span(
    provider="openai",
    model="gpt-4o-mini",
    request={"messages": [{"role": "user", "content": "Hello"}]},
    conversation_id="conv-123",
)
span.set_ttft(ms=210)
span.set_usage(prompt_tokens=42, completion_tokens=11)
span.end(status="success", streamed=True)
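
When recording a manual span around your own streaming loop, the time-to-first-token and total latency have to be measured by the caller. The sketch below shows one way to do that with `time.perf_counter()`. `timed_stream` and `fake_stream` are hypothetical helpers, and the stream is simulated, so no SDK or provider call is made here.

```python
import time


def timed_stream(chunks):
    """Consume an iterable of text chunks; return (text, ttft_ms, latency_ms)."""
    start = time.perf_counter()
    ttft_ms = None
    parts = []
    for chunk in chunks:
        if ttft_ms is None:
            # First chunk has arrived: this is the time-to-first-token
            ttft_ms = (time.perf_counter() - start) * 1000
        parts.append(chunk)
    latency_ms = (time.perf_counter() - start) * 1000
    return "".join(parts), ttft_ms, latency_ms


def fake_stream():
    """Simulated provider stream with a short delay before the first chunk."""
    time.sleep(0.01)
    yield "Hello"
    yield ", world"


text, ttft_ms, latency_ms = timed_stream(fake_stream())
```

The measured `ttft_ms` (rounded if an integer is expected) is the kind of value passed to `span.set_ttft(ms=...)` in the example above. If the stream yields nothing, `ttft_ms` stays `None`.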

License

MIT
