llm-obs

Lightweight Python SDK for LLM inference logging and observability.

Auto-instruments OpenAI, Anthropic, Google Gemini, AWS Bedrock, Ollama, and any OpenAI-compatible endpoint — zero changes to your LLM call code.


Install

# Core SDK
pip install llm-obs

# With provider extras
pip install "llm-obs[openai]"
pip install "llm-obs[anthropic]"
pip install "llm-obs[gemini]"
pip install "llm-obs[bedrock]"
pip install "llm-obs[all]"

Quickstart — one line

from llm_obs import ObservabilityClient

obs = ObservabilityClient(
    endpoint="http://localhost:4000",   # your ingestion API
    api_key="dev-key",
)
obs.auto_instrument()   # patches all installed LLM libraries automatically

From this point on, every call made through an instrumented LLM library is logged automatically. No other changes are needed.
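To give a feel for what "patching" means here, the following is a minimal sketch of method wrapping in general. It is not the SDK's actual implementation: `FakeLLMClient`, `instrument`, and the record fields are hypothetical names chosen for illustration.

```python
import functools
import time


class FakeLLMClient:
    """Stand-in for a provider client; hypothetical, not a real library class."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


def instrument(cls, method_name, sink):
    """Replace cls.<method_name> with a wrapper that appends one record per call to sink."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        status = "error"  # overwritten only if the call returns normally
        try:
            result = original(self, *args, **kwargs)
            status = "success"
            return result
        finally:
            sink.append({
                "method": method_name,
                "status": status,
                "latency_ms": (time.perf_counter() - start) * 1000.0,
            })

    setattr(cls, method_name, wrapper)


records = []
instrument(FakeLLMClient, "complete", records)
reply = FakeLLMClient().complete("hi")  # the call site itself is unchanged
```

Because the wrapper is installed on the class, existing call sites pick it up without modification, which is the property the quickstart relies on.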


Stream chat

import asyncio

from llm_obs import stream_chat, set_obs_context


async def main():
    # Set conversation context (picked up automatically by the SDK)
    set_obs_context(conversation_id="conv-123")

    # Unified streaming across all providers.
    # `async for` is only valid inside a coroutine, hence the main() wrapper.
    async for chunk in stream_chat(provider="openai", model="gpt-4o-mini", messages=[
        {"role": "user", "content": "Explain Redis in one sentence."}
    ]):
        print(chunk, end="", flush=True)


asyncio.run(main())

Provider detection from URL

from llm_obs import detect_provider, available_providers
import os

os.environ["LLM_ENDPOINTS"] = "http://localhost:11434"  # Ollama, vLLM, or any URL

# SDK probes the URL and detects what's running
providers = available_providers()
# → {"ollama": ["gemma3:4b", "llama3.2", ...]}

Supported URL detection:

  • Ollama — detected via GET /api/tags
  • vLLM / LiteLLM / LocalAI — detected via GET /v1/models
  • AWS Bedrock — detected from URL pattern (amazonaws.com)
  • OpenAI / Anthropic / Google — detected from known API URL patterns
  • Private VPC — probed automatically
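The detection rules above could be sketched roughly as follows. This is an illustration under assumptions, not the SDK's code: `classify`, `http_get_json`, the `KNOWN_HOSTS` mapping, and the `"openai-compatible"` label are hypothetical. The response shapes checked (`models` for Ollama's `/api/tags`, `data` for `/v1/models`) follow those APIs' documented formats.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Hosted APIs recognised by hostname alone (illustrative mapping, an assumption).
KNOWN_HOSTS = {
    "api.openai.com": "openai",
    "api.anthropic.com": "anthropic",
    "generativelanguage.googleapis.com": "gemini",
}


def http_get_json(url, timeout=2.0):
    """GET a URL and parse the body as JSON; return None on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        return None


def classify(base_url, fetch=http_get_json):
    """Return a provider label for base_url, or None if nothing is recognised."""
    host = urlparse(base_url).hostname or ""
    # 1. URL patterns need no network round trip.
    if host.endswith("amazonaws.com"):
        return "bedrock"
    if host in KNOWN_HOSTS:
        return KNOWN_HOSTS[host]
    # 2. Otherwise probe the endpoint, Ollama first.
    base = base_url.rstrip("/")
    tags = fetch(base + "/api/tags")
    if isinstance(tags, dict) and "models" in tags:
        return "ollama"
    models = fetch(base + "/v1/models")
    if isinstance(models, dict) and "data" in models:
        return "openai-compatible"
    return None
```

Passing `fetch` as a parameter keeps the probing logic testable without a live server; in real use the default `http_get_json` performs the HTTP requests.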

What gets logged per call

Field                               Description
provider / model                    Who served the request
latency_ms                          Total wall-clock time
ttft_ms                             Time-to-first-token (streaming)
prompt_tokens / completion_tokens   Token usage
cost_usd                            Computed from built-in price table
status                              success, error, cancelled
request / response                  PII-redacted payloads
conversation_id                     Linked via set_obs_context()
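As an illustration of how a cost_usd value can be derived from token usage and a price table, here is a small sketch. The table contents, the per-million-token pricing unit, and the names `PRICE_TABLE` and `cost_usd` are assumptions made for the example; the SDK's built-in table and its prices are not shown in this document.

```python
from decimal import Decimal

# HYPOTHETICAL prices in USD per 1,000,000 tokens; not real provider pricing.
PRICE_TABLE = {
    ("example-provider", "example-model"): {
        "input": Decimal("0.50"),
        "output": Decimal("1.50"),
    },
}


def cost_usd(provider, model, prompt_tokens, completion_tokens):
    """Return the call cost in USD, or None when the model has no price entry."""
    prices = PRICE_TABLE.get((provider, model))
    if prices is None:
        return None
    total = prompt_tokens * prices["input"] + completion_tokens * prices["output"]
    return total / Decimal(1_000_000)
```

`Decimal` is used instead of `float` so that very small per-call amounts do not pick up binary rounding noise when summed later.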

PII redaction

PII is redacted in-process, before any data leaves over HTTP. Covered patterns: email addresses, phone numbers, SSNs, credit card numbers (Luhn-checked), API keys, IPv4 addresses, and secrets embedded in URLs.

obs = ObservabilityClient(..., redact_pii=True)   # default: True
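For readers unfamiliar with pattern-based redaction, the sketch below shows the general idea for three of the categories listed: email, SSN, and credit cards with a Luhn check. The regular expressions, the placeholder tokens such as `[EMAIL]`, and the function names are assumptions for illustration; the SDK's own patterns may differ.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def redact(text: str) -> str:
    """Replace emails, SSNs, and Luhn-valid card numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SSN_RE.sub("[SSN]", text)
    # Only digit runs that pass the Luhn check are treated as card numbers.
    return CARD_RE.sub(
        lambda m: "[CARD]" if luhn_valid(m.group()) else m.group(), text
    )
```

The Luhn check reduces false positives: a long digit run such as an order number is left untouched unless its checksum happens to be valid.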

Manual span

span = obs.start_span(
    provider="openai",
    model="gpt-4o-mini",
    request={"messages": [{"role": "user", "content": "Hello"}]},
    conversation_id="conv-123",
)
span.set_ttft(ms=210)
span.set_usage(prompt_tokens=42, completion_tokens=11)
span.end(status="success", streamed=True)
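When recording a span by hand, the TTFT and latency values have to be measured by the caller. One way to do that for a synchronous stream is sketched below; `measure_stream` is a hypothetical helper, not part of the SDK, and the injectable `clock` parameter exists only to make the timing testable.

```python
import time


def measure_stream(chunks, clock=time.perf_counter):
    """Consume an iterable of text chunks and time it.

    Returns (full_text, ttft_ms, latency_ms). ttft_ms is None if the
    stream produced no chunks.
    """
    start = clock()
    first = None
    parts = []
    for chunk in chunks:
        if first is None:
            first = clock()  # timestamp of the first chunk's arrival
        parts.append(chunk)
    end = clock()
    ttft_ms = None if first is None else (first - start) * 1000.0
    return "".join(parts), ttft_ms, (end - start) * 1000.0
```

The measured `ttft_ms` could then be handed to `span.set_ttft(ms=...)` from the example above before calling `span.end(...)`.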

License

MIT
