Lightweight SDK for LLM inference logging and observability

llm-obs

Lightweight Python SDK for LLM inference logging and observability.

Auto-instruments OpenAI, Anthropic, Google Gemini, AWS Bedrock, Ollama, and any OpenAI-compatible endpoint, with no changes to your existing LLM call code.


Install

# Core SDK
pip install llm-obs

# With provider extras
pip install "llm-obs[openai]"
pip install "llm-obs[anthropic]"
pip install "llm-obs[gemini]"
pip install "llm-obs[bedrock]"
pip install "llm-obs[all]"

Quickstart — one line

from llm_obs import ObservabilityClient

obs = ObservabilityClient(
    endpoint="http://localhost:4000",   # your ingestion API
    api_key="dev-key",
)
obs.auto_instrument()   # patches all installed LLM libraries automatically

From this point, every LLM call in your app is logged automatically. No other changes needed.
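
For readers curious how this kind of auto-instrumentation can work without touching call sites, here is a minimal sketch of method patching. It is an illustration of the general technique, not llm-obs's actual implementation: FakeLLMClient, instrument, and the record fields are hypothetical names chosen for this example.

```python
import functools
import time


class FakeLLMClient:
    """Hypothetical stand-in for a provider SDK client."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


def instrument(cls, method_name, sink):
    """Replace cls.method_name with a wrapper that logs each call to sink."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        status = "error"  # overwritten only if the call returns normally
        try:
            result = original(self, *args, **kwargs)
            status = "success"
            return result
        finally:
            sink.append({
                "method": method_name,
                "status": status,
                "latency_ms": (time.perf_counter() - start) * 1000.0,
            })

    setattr(cls, method_name, wrapper)


records = []
instrument(FakeLLMClient, "complete", records)
reply = FakeLLMClient().complete("hi")  # the call site itself is unchanged
```

After the patch, records holds one entry per call, while the caller still receives the original return value.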


Stream chat

import asyncio

from llm_obs import stream_chat, set_obs_context


async def main():
    # Set conversation context (picked up automatically by the SDK)
    set_obs_context(conversation_id="conv-123")

    # Unified streaming across all providers
    async for chunk in stream_chat(
        provider="openai",
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Explain Redis in one sentence."}],
    ):
        print(chunk, end="", flush=True)


asyncio.run(main())
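
One common way to make a context value "picked up automatically" across async code is Python's contextvars module. The sketch below shows that technique in isolation. Whether llm-obs uses contextvars internally is an assumption, and set_context, current_context, and handle are hypothetical names, not part of the SDK.

```python
import asyncio
import contextvars

# Each asyncio task gets its own copy of the context, so values do not leak
# between concurrent conversations.
_conversation_id = contextvars.ContextVar("conversation_id", default=None)


def set_context(conversation_id):
    """Hypothetical analogue of set_obs_context()."""
    _conversation_id.set(conversation_id)


def current_context():
    """What a logger could read at the moment it records a call."""
    return {"conversation_id": _conversation_id.get()}


async def handle(conversation_id):
    set_context(conversation_id)
    await asyncio.sleep(0)  # yield so the two tasks interleave
    return current_context()["conversation_id"]


async def main():
    return await asyncio.gather(handle("conv-1"), handle("conv-2"))


results = asyncio.run(main())
```

Each task sees only the ID it set, even though both run on the same event loop.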

Provider detection from URL

from llm_obs import detect_provider, available_providers
import os

os.environ["LLM_ENDPOINTS"] = "http://localhost:11434"  # Ollama, vLLM, or any URL

# SDK probes the URL and detects what's running
providers = available_providers()
# → {"ollama": ["gemma3:4b", "llama3.2", ...]}

Supported URL detection:

  • Ollama — detected via GET /api/tags
  • vLLM / LiteLLM / LocalAI — detected via GET /v1/models
  • AWS Bedrock — detected from URL pattern (amazonaws.com)
  • OpenAI / Anthropic / Google — detected from known API URL patterns
  • Private VPC — probed automatically
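
The pattern-based part of the list above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: the host table, guess_provider, and probe_urls are hypothetical and do not reflect the SDK's real rules. Only the two probe paths (/api/tags and /v1/models) and the amazonaws.com pattern come from the list above.

```python
from urllib.parse import urlparse

# Hypothetical host-suffix table for illustration.
HOST_PATTERNS = [
    ("amazonaws.com", "bedrock"),
    ("api.openai.com", "openai"),
    ("api.anthropic.com", "anthropic"),
    ("generativelanguage.googleapis.com", "gemini"),
]

# Probe paths taken from the list above.
PROBES = [("ollama", "/api/tags"), ("openai-compatible", "/v1/models")]


def guess_provider(url):
    """Return a provider name from the hostname alone, or None if unknown."""
    host = (urlparse(url).hostname or "").lower()
    for suffix, provider in HOST_PATTERNS:
        if host == suffix or host.endswith("." + suffix):
            return provider
    return None  # unknown host: a live probe would be needed


def probe_urls(base_url):
    """List the (label, URL) pairs a prober could try with GET, in order."""
    base = base_url.rstrip("/")
    return [(label, base + path) for label, path in PROBES]
```

A local URL such as http://localhost:11434 matches no host pattern, so it would fall through to the probe list.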

What gets logged per call

Field                              Description
provider / model                   Who served the request
latency_ms                         Total wall-clock time
ttft_ms                            Time-to-first-token (streaming)
prompt_tokens / completion_tokens  Token usage
cost_usd                           Computed from built-in price table
status                             success, error, cancelled
request / response                 PII-redacted payloads
conversation_id                    Linked via set_obs_context()
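
To make the cost_usd field concrete, here is a minimal sketch of deriving cost from token counts and a price table. The prices, the model name, and the CallRecord class are hypothetical. The SDK's built-in price table and record schema are not shown in this README.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1M tokens, for illustration only.
PRICES = {"example-model": {"prompt": 1.00, "completion": 2.00}}


@dataclass
class CallRecord:
    provider: str
    model: str
    latency_ms: float
    prompt_tokens: int
    completion_tokens: int
    status: str = "success"

    @property
    def cost_usd(self):
        """Cost from the price table, or None for an unknown model."""
        price = PRICES.get(self.model)
        if price is None:
            return None
        return (
            self.prompt_tokens * price["prompt"]
            + self.completion_tokens * price["completion"]
        ) / 1_000_000


record = CallRecord("example", "example-model", 850.0, 1000, 500)
unknown = CallRecord("example", "unlisted-model", 850.0, 1000, 500)
```

Returning None for an unlisted model, rather than 0, keeps "free" and "unknown price" distinguishable in downstream dashboards.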

PII redaction

PII is redacted in-process, before any data leaves over HTTP. Redacted types: email addresses, phone numbers, SSNs, credit card numbers (Luhn-validated), API keys, IPv4 addresses, and secrets embedded in URLs.

obs = ObservabilityClient(..., redact_pii=True)   # default: True
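
As an illustration of what Luhn-checked redaction involves, the sketch below masks email addresses and card-like digit runs that pass the Luhn checksum. The regular expressions, the placeholder tokens, and the redact function are assumptions made for this example. The SDK's actual patterns are not documented here.

```python
import re

# Simplified patterns for illustration; real-world detectors are stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits


def luhn_valid(number):
    """Return True if the digits in number pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text):
    """Mask emails, and digit runs that look like valid card numbers."""
    text = EMAIL_RE.sub("[EMAIL]", text)

    def mask_card(match):
        return "[CARD]" if luhn_valid(match.group()) else match.group()

    return CARD_RE.sub(mask_card, text)
```

The Luhn check reduces false positives: a 16-digit order ID that fails the checksum is left untouched.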

Manual span

span = obs.start_span(
    provider="openai",
    model="gpt-4o-mini",
    request={"messages": [{"role": "user", "content": "Hello"}]},
    conversation_id="conv-123",
)
span.set_ttft(ms=210)
span.set_usage(prompt_tokens=42, completion_tokens=11)
span.end(status="success", streamed=True)
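
The ttft and latency values passed to a manual span have to be measured by the caller. A minimal sketch of timing a streamed response follows. timed_stream and fake_stream are hypothetical helpers, not part of llm-obs.

```python
import time


def timed_stream(chunks, clock=time.perf_counter):
    """Consume an iterable of chunks; return (text, ttft_ms, latency_ms)."""
    start = clock()
    ttft_ms = None
    parts = []
    for chunk in chunks:
        if ttft_ms is None:
            # First chunk arrived: record time-to-first-token.
            ttft_ms = (clock() - start) * 1000.0
        parts.append(chunk)
    latency_ms = (clock() - start) * 1000.0
    return "".join(parts), ttft_ms, latency_ms


def fake_stream():
    """Hypothetical stand-in for a provider's streaming response."""
    for piece in ("Hel", "lo"):
        time.sleep(0.01)
        yield piece


text, ttft_ms, latency_ms = timed_stream(fake_stream())
```

The measured ttft_ms could then be handed to span.set_ttft(ms=...) as in the example above. If the stream yields no chunks, ttft_ms stays None and the set_ttft call should be skipped.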

License

MIT
