OpenSearch AI Observability SDK — OTEL-native tracing and scoring for LLM applications

OpenSearch GenAI SDK

OTEL-native tracing and scoring for LLM applications. Instrument your AI workflows with standard OpenTelemetry spans and submit evaluation scores — all routed to OpenSearch through a single OTLP pipeline.

Features

  • One-line setup: register() configures the full OTEL pipeline (TracerProvider, exporter, auto-instrumentation)
  • Decorators: @workflow, @task, @agent, @tool wrap functions as OTEL spans with GenAI semantic convention attributes
  • Auto-instrumentation: automatically discovers and activates installed instrumentor packages (OpenAI, Anthropic, Bedrock, LangChain, etc.)
  • Scoring: score() emits evaluation metrics as OTEL spans at span, trace, or session level
  • AWS SigV4: built-in SigV4 signing for AWS-hosted OpenSearch and Data Prepper endpoints
  • Zero lock-in: remove a decorator and your code still works; everything is standard OTEL

Requirements

  • Python: 3.10, 3.11, 3.12, or 3.13
  • OpenTelemetry SDK: ≥1.20.0, <2

Installation

pip install opensearch-genai-sdk-py

The core package includes the OTEL SDK and exporters. Auto-instrumentation of LLM libraries is opt-in — install only the providers you use:

# Single provider (quotes keep shells such as zsh from expanding the brackets)
pip install "opensearch-genai-sdk-py[openai]"
pip install "opensearch-genai-sdk-py[anthropic]"
pip install "opensearch-genai-sdk-py[bedrock]"
pip install "opensearch-genai-sdk-py[langchain]"

# Multiple providers
pip install "opensearch-genai-sdk-py[openai,anthropic]"

# All instrumentors at once
pip install "opensearch-genai-sdk-py[instrumentors]"

# AWS SigV4 signing for OpenSearch Ingestion / OpenSearch Service
pip install "opensearch-genai-sdk-py[aws]"

# Everything
pip install "opensearch-genai-sdk-py[all]"

Available extras: openai, anthropic, cohere, mistral, groq, ollama, google, bedrock, langchain, llamaindex, instrumentors (all of the above), aws, all

Quick Start

from opensearch_genai_sdk_py import register, workflow, agent, tool, score

# 1. Initialize tracing (one line)
register(endpoint="http://localhost:4318/v1/traces")

# 2. Decorate your functions
@tool("get_weather")
def get_weather(city: str) -> dict:
    """Fetch weather data for a city."""
    return {"city": city, "temp": 22, "condition": "sunny"}

@agent("weather_assistant")
def assistant(query: str) -> str:
    data = get_weather("Paris")
    return f"{data['condition']}, {data['temp']}C"

@workflow("weather_query")
def run(query: str) -> str:
    return assistant(query)

result = run("What's the weather?")

# 3. Submit scores (after workflow completes)
score(name="relevance", value=0.95, trace_id="...", source="llm-judge")

Architecture

┌─────────────────────────────────────────────────────┐
│                  Your Application                    │
│                                                      │
│  @workflow ─→ @agent ─→ @tool    score()            │
│     │            │         │        │                │
│     └────────────┴─────────┴────────┘                │
│                     │                                │
│            opensearch-genai-sdk-py                    │
├─────────────────────────────────────────────────────┤
│  register()                                          │
│  ┌─────────────────────────────────────────────┐    │
│  │  TracerProvider                              │    │
│  │  ├── Resource (service.name)                 │    │
│  │  ├── BatchSpanProcessor                      │    │
│  │  │   └── OTLPSpanExporter (HTTP or gRPC)     │    │
│  │  │       └── SigV4 signing (AWS endpoints)   │    │
│  │  └── Auto-instrumentation                    │    │
│  │      ├── openai, anthropic, bedrock, ...     │    │
│  │      ├── langchain, llamaindex, haystack     │    │
│  │      └── chromadb, pinecone, qdrant, ...     │    │
│  └─────────────────────────────────────────────┘    │
└──────────────────────┬──────────────────────────────┘
                       │ OTLP (HTTP/gRPC)
                       ▼
              ┌─────────────────┐
              │  Data Prepper /  │
              │  OTEL Collector  │
              └────────┬────────┘
                       │
                       ▼
              ┌─────────────────┐
              │   OpenSearch     │
              │  ├── traces      │
              │  └── scores      │
              └─────────────────┘

API Reference

register()

Configures the OTEL tracing pipeline. Call once at startup.

register(
    endpoint="https://pipeline.us-east-1.osis.amazonaws.com/v1/traces",
    service_name="my-app",
    auth="auto",           # "auto" (default) = SigV4 for *.amazonaws.com, unsigned otherwise | "sigv4" | "none"
    batch=True,            # BatchSpanProcessor (True) or Simple (False)
    auto_instrument=True,  # discover installed instrumentor packages
)

Endpoint formats:

URL scheme           Transport
http:// / https://   HTTP (default)
grpc://              gRPC (insecure)
grpcs://             gRPC (TLS)
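
The scheme-to-transport mapping in the table can be sketched with the standard library alone. This is an illustration of the rule, not the SDK's internal code; pick_transport and _SCHEMES are hypothetical names.

```python
from urllib.parse import urlsplit

# Hypothetical lookup mirroring the table above: scheme -> (transport, uses_tls).
_SCHEMES = {
    "http": ("http", False),
    "https": ("http", True),
    "grpc": ("grpc", False),
    "grpcs": ("grpc", True),
}


def pick_transport(endpoint: str) -> tuple[str, bool]:
    """Return (transport, uses_tls) for an endpoint URL."""
    scheme = urlsplit(endpoint).scheme.lower()
    if scheme not in _SCHEMES:
        raise ValueError(f"unsupported endpoint scheme: {scheme!r}")
    return _SCHEMES[scheme]


print(pick_transport("http://localhost:4318/v1/traces"))
print(pick_transport("grpcs://collector.example:4317"))
```

Rejecting unknown schemes early is a design choice of this sketch; the SDK may behave differently.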

Auth:

  • auth="auto" (default): detects AWS endpoints (*.amazonaws.com) and enables SigV4 for them; requests to every other endpoint are sent unsigned.
  • auth="sigv4": always sign with SigV4 (requires pip install "opensearch-genai-sdk-py[aws]").
  • auth="none": never sign requests.
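
As a rough sketch of the auto-detection rule described above, the check below matches the endpoint's hostname against *.amazonaws.com. The function name resolve_auth is hypothetical, and the SDK's actual detection logic may differ.

```python
from urllib.parse import urlsplit


def resolve_auth(endpoint: str, auth: str = "auto") -> str:
    """Return "sigv4" or "none" for the given endpoint and auth mode."""
    if auth in ("sigv4", "none"):
        return auth  # explicit modes are honored as-is
    if auth != "auto":
        raise ValueError(f"unknown auth mode: {auth!r}")
    # Compare the hostname only, so "amazonaws.com" appearing in a path
    # or as a prefix of another domain does not trigger signing.
    host = (urlsplit(endpoint).hostname or "").lower()
    return "sigv4" if host.endswith(".amazonaws.com") else "none"


print(resolve_auth("https://pipeline.us-east-1.osis.amazonaws.com/v1/traces"))
print(resolve_auth("http://localhost:4318/v1/traces"))
```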

Decorators

Four decorators for tracing application logic. Each creates an OTEL span with gen_ai.* semantic convention attributes.

Decorator           Use for                   Operation name   Span name format
@workflow("name")   Top-level orchestration   invoke_agent     name
@task("name")       Units of work             invoke_agent     name
@agent("name")      Autonomous agent logic    invoke_agent     invoke_agent name
@tool("name")       Tool/function calls       execute_tool     execute_tool name

All decorators accept name (defaults to the function's __qualname__) and version.

Attributes set automatically:

Attribute                                               Set by
gen_ai.operation.name                                   All decorators
gen_ai.agent.name / gen_ai.tool.name                    All decorators
gen_ai.input.messages / gen_ai.output.messages          @workflow, @task, @agent
gen_ai.tool.call.arguments / gen_ai.tool.call.result    @tool
gen_ai.tool.type                                        @tool (always "function")
gen_ai.tool.description                                 @tool (from docstring, if present)
gen_ai.agent.version                                    All decorators (when version is set)

Supported function types: sync, async, generators, async generators. Errors are captured as span status + exception events.
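
To show what supporting all four function types involves, here is a minimal standard-library sketch of a decorator that dispatches on the function kind and records success or failure. It is not the SDK's implementation: traced, _record, and EVENTS are hypothetical stand-ins, and an in-memory list takes the place of OTEL spans.

```python
import asyncio
import contextlib
import functools
import inspect

EVENTS: list[tuple[str, str]] = []  # stand-in for spans: (name, "ok" | "error")


@contextlib.contextmanager
def _record(label):
    """Record one event when the wrapped call finishes or raises."""
    try:
        yield
    except Exception:
        EVENTS.append((label, "error"))
        raise
    else:
        EVENTS.append((label, "ok"))


def traced(name=None):
    """Hypothetical decorator covering sync, async, and both generator kinds."""
    def decorate(fn):
        label = name or fn.__qualname__
        if inspect.isasyncgenfunction(fn):
            @functools.wraps(fn)
            async def wrapper(*args, **kwargs):
                with _record(label):
                    async for item in fn(*args, **kwargs):
                        yield item
        elif inspect.iscoroutinefunction(fn):
            @functools.wraps(fn)
            async def wrapper(*args, **kwargs):
                with _record(label):
                    return await fn(*args, **kwargs)
        elif inspect.isgeneratorfunction(fn):
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                with _record(label):
                    yield from fn(*args, **kwargs)
        else:
            @functools.wraps(fn)
            def wrapper(*args, **kwargs):
                with _record(label):
                    return fn(*args, **kwargs)
        return wrapper
    return decorate


@traced("stream")
def stream(n):
    yield from range(n)


@traced("fetch")
async def fetch():
    return "done"


list(stream(3))
asyncio.run(fetch())
print(EVENTS)  # [('stream', 'ok'), ('fetch', 'ok')]
```

For generators, the event is recorded only once the generator is exhausted or raises, which is why the wrapper must itself be a generator rather than returning the inner one.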

@agent("research_agent", version=2)
async def research(query: str) -> str:
    """Agents create invoke_agent spans with gen_ai.agent.* attributes."""
    result = search_tool(query)  # search_tool is synchronous, so it is not awaited
    return summarize(result)  # summarize is a placeholder for your own helper

@tool("search")
def search_tool(query: str) -> list:
    """Docstring becomes gen_ai.tool.description. Input/output use gen_ai.tool.call.* attributes."""
    return api.search(query)  # api is a placeholder for your own search client

score()

Submits evaluation scores as OTEL spans. Use any evaluation framework you prefer (autoevals, RAGAS, custom) and submit the results through score().

Three scoring levels:

# Span-level: score a specific LLM call or tool execution
score(
    name="accuracy",
    value=0.95,
    trace_id="abc123",
    span_id="def456",
    explanation="Weather data matches ground truth",
    source="heuristic",
)

# Trace-level: score an entire workflow
score(
    name="relevance",
    value=0.92,
    trace_id="abc123",
    explanation="Response addresses the user's query",
    source="llm-judge",
)

# Session-level: score across multiple traces in a conversation
score(
    name="user_satisfaction",
    value=0.88,
    conversation_id="session-123",
    label="satisfied",
    source="human",
)

Parameters:

Parameter         Type    Description
name              str     Metric name (e.g., "relevance", "factuality")
value             float   Numeric score
trace_id          str     Trace being scored (span/trace-level)
span_id           str     Span being scored (span-level)
conversation_id   str     Session being scored (session-level)
label             str     Human-readable label ("pass", "relevant")
explanation       str     Evaluator justification (truncated to 500 chars)
response_id       str     LLM completion ID for correlation
source            str     Score origin: "sdk", "human", "llm-judge", "heuristic"
metadata          dict    Arbitrary key-value metadata

Scores are emitted as gen_ai.evaluation.result spans with gen_ai.evaluation.* attributes, following the OTEL GenAI semantic conventions.
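
The three scoring levels and the 500-character truncation can be pictured with the sketch below. It is an assumption-laden illustration, not the SDK's code: scoring_level and build_score_attributes are hypothetical helpers, the "level" key is invented for the example, and the gen_ai.evaluation.* keys are assumed from the OTEL GenAI semantic conventions rather than confirmed against this package.

```python
MAX_EXPLANATION_CHARS = 500  # truncation limit stated in the parameter table


def scoring_level(trace_id=None, span_id=None, conversation_id=None) -> str:
    """Infer the scoring level from which identifiers were supplied."""
    if span_id is not None:
        # Assumption: a span-level score also carries the owning trace_id.
        if trace_id is None:
            raise ValueError("span-level scores need trace_id as well as span_id")
        return "span"
    if trace_id is not None:
        return "trace"
    if conversation_id is not None:
        return "session"
    raise ValueError("provide trace_id, trace_id + span_id, or conversation_id")


def build_score_attributes(name, value, *, trace_id=None, span_id=None,
                           conversation_id=None, label=None, explanation=None):
    """Assemble an attribute dict for one evaluation result."""
    attrs = {
        "level": scoring_level(trace_id, span_id, conversation_id),  # hypothetical key
        "gen_ai.evaluation.name": name,
        "gen_ai.evaluation.score.value": float(value),
    }
    if label is not None:
        attrs["gen_ai.evaluation.score.label"] = label
    if explanation is not None:
        attrs["gen_ai.evaluation.explanation"] = explanation[:MAX_EXPLANATION_CHARS]
    return attrs


print(build_score_attributes("relevance", 0.92, trace_id="abc123")["level"])
```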

Auto-Instrumented Libraries

register() automatically discovers and activates any installed instrumentor packages via OTEL entry points. No code changes needed — install the extras for the providers you use and their calls are traced automatically.

Category           Extras / packages
LLM providers      [openai], [anthropic], [cohere], [mistral], [groq], [ollama]
Cloud AI           [bedrock], [google] (Vertex AI + Generative AI)
Frameworks         [langchain], [llamaindex]
All of the above   [instrumentors]

Additional instrumentors (Together, Replicate, Writer, Voyage AI, Aleph Alpha, SageMaker, watsonx, Haystack, CrewAI, Agno, MCP, Transformers, ChromaDB, Pinecone, Qdrant, Weaviate, Milvus, LanceDB, Marqo) are included in [instrumentors] but do not have individual extras.

Configuration

Environment Variable       Description                       Default
OPENSEARCH_OTEL_ENDPOINT   OTLP endpoint URL                 http://localhost:21890/opentelemetry/v1/traces
OTEL_SERVICE_NAME          Service name for spans            "default"
OPENSEARCH_PROJECT         Project/service name (fallback)   "default"
AWS_DEFAULT_REGION         AWS region for SigV4              auto-detected
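
The fallback order implied by the table (OTEL_SERVICE_NAME first, then OPENSEARCH_PROJECT, then "default") can be sketched as follows. resolve_settings is a hypothetical helper; how the SDK combines these variables with explicit register() arguments is not covered here.

```python
import os
from typing import Mapping

DEFAULT_ENDPOINT = "http://localhost:21890/opentelemetry/v1/traces"


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve endpoint and service name from environment-style variables."""
    endpoint = env.get("OPENSEARCH_OTEL_ENDPOINT") or DEFAULT_ENDPOINT
    # OPENSEARCH_PROJECT is only consulted when OTEL_SERVICE_NAME is unset or empty.
    service_name = (
        env.get("OTEL_SERVICE_NAME")
        or env.get("OPENSEARCH_PROJECT")
        or "default"
    )
    return {"endpoint": endpoint, "service_name": service_name}


print(resolve_settings({"OPENSEARCH_PROJECT": "demo"}))
```

Passing the mapping explicitly, instead of reading os.environ inside the function, keeps the lookup easy to test.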

License

Apache-2.0
