
Last9 observability attributes for OpenTelemetry GenAI spans - track costs, workflows, and conversations in LLM applications

Project description

Last9 GenAI - Python SDK

OpenTelemetry extension for LLM observability: track conversations, workflows, and costs


Overview

Track conversations and workflows in your LLM applications with automatic context propagation. Built on OpenTelemetry for seamless integration with your existing observability stack.

Not a replacement for OTel auto-instrumentation — works alongside it or standalone.

Key Features:

  • 🎯 Conversation Tracking: Automatic multi-turn conversation tracking with conversation_context
  • 🤖 Agent Tracking: First-class agent identity with agent_context (OTel gen_ai.agent.* semantic conventions)
  • 🔄 Workflow Management: Track complex multi-step AI workflows with workflow_context
  • 🎨 Zero-Touch Instrumentation: @observe() decorator for automatic tracking
  • 📊 Context Propagation: Thread-safe attribute tracking across nested operations
  • 💰 Optional Cost Tracking: Bring your own pricing for cost monitoring
  • 🏷️ Span Classification: Filter by type (llm/tool/chain/agent/prompt)

Features

Core Tracking

  • 🎯 Conversation Tracking: Multi-turn conversations with gen_ai.conversation.id and turn numbers
  • 🤖 Agent Identity: Track agents with gen_ai.agent.id, gen_ai.agent.name, gen_ai.agent.version (OTel semantic conventions)
  • 🔄 Workflow Management: Track multi-step AI operations across LLM calls, tools, and retrievals
  • 📊 Auto-Context Propagation: Thread-safe context managers that automatically tag all nested operations
  • 🎨 Decorator Pattern: @observe() for zero-touch instrumentation with full input/output/latency tracking
  • 🔧 SpanProcessor: Automatic context enrichment for all spans in your application

Enhanced Observability

  • 🏷️ Span Classification: gen_ai.l9.span.kind for filtering (llm/tool/chain/agent/prompt)
  • 🛠️ Tool/Function Tracking: Enhanced attributes for function calls and tool usage
  • 📈 Performance Metrics: Response times, token counts, and quality scores
  • 🌐 Provider Agnostic: Works with OpenAI, Anthropic, Google, Cohere, etc.
  • 📏 Standard Attributes: Full OpenTelemetry gen_ai.* semantic conventions

Optional Features

  • 💰 Cost Tracking: Bring your own model pricing for cost monitoring
  • 💸 Workflow Costing: Aggregate costs across multi-step operations

Relationship to OpenTelemetry GenAI

This is an EXTENSION, not a replacement:

| Package | Purpose | Approach |
| --- | --- | --- |
| OTel GenAI (opentelemetry-instrumentation-openai-v2) | Auto-instrument LLM SDKs | Automatic (monkey-patching) |
| Last9 GenAI (last9-genai) | Add conversation/workflow tracking | Context-based enrichment |

You can use:

  1. Last9 GenAI alone - Full conversation and workflow tracking
  2. Both together - OTel auto-traces + Last9 adds conversation/workflow context (recommended!)

See Working with OTel Auto-Instrumentation for combined usage.

Installation

Basic:

pip install last9-genai

With OTLP export (recommended):

pip install last9-genai[otlp]

Requirements:

  • Python 3.10+
  • opentelemetry-api>=1.20.0
  • opentelemetry-sdk>=1.20.0

Quick Start

Note: The examples below use client to represent your LLM client. Initialize your preferred provider:

# OpenAI
from openai import OpenAI
client = OpenAI()

# Or Anthropic
from anthropic import Anthropic
anthropic_client = Anthropic()

# Or any other provider (Google, Cohere, etc.)

The SDK works with any LLM provider - just use your client normally!

Track Conversations (Recommended)

Automatically track multi-turn conversations with zero manual instrumentation:

from last9_genai import conversation_context, Last9SpanProcessor
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

# Setup tracing with Last9 processor
provider = TracerProvider()
trace.set_tracer_provider(provider)
provider.add_span_processor(Last9SpanProcessor())

# Track conversations automatically - works with any LLM provider
with conversation_context(conversation_id="session_123", user_id="user_456"):
    # OpenAI
    response1 = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )

    # Anthropic (same context!)
    response2 = anthropic_client.messages.create(
        model="claude-sonnet-4",
        messages=[{"role": "user", "content": "How are you?"}]
    )
    # Both calls automatically have conversation_id = "session_123"!

Track Workflows

Track complex multi-step AI operations:

from last9_genai import workflow_context

# Track entire workflow with automatic tagging
with workflow_context(workflow_id="rag_search", workflow_type="retrieval"):
    # All operations automatically tagged with workflow_id
    docs = retrieve_documents(query)  # Tagged
    context = rerank_documents(docs)   # Tagged
    response = generate_answer(context) # Tagged
    # Full workflow visibility with zero manual instrumentation!

# Nest workflows and conversations
with conversation_context(conversation_id="support_123"):
    with workflow_context(workflow_id="order_lookup"):
        # Both conversation AND workflow tracked automatically
        result = lookup_and_respond()

Track Agents

Track agent identity using OTel GenAI semantic conventions (gen_ai.agent.*):

from last9_genai import agent_context

# Track agent identity — all child spans get gen_ai.agent.* attributes
with agent_context(agent_id="support_bot_v2", agent_name="Support Bot", agent_version="2.0"):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Help me with my order"}]
    )
    # Span automatically has gen_ai.agent.id, gen_ai.agent.name, gen_ai.agent.version

# Nest with conversations for full context
with conversation_context(conversation_id="session_123", user_id="user_456"):
    with agent_context(agent_id="router_agent", agent_name="Router"):
        route = classify_intent(query)

    with agent_context(agent_id="support_agent", agent_name="Support"):
        response = handle_support(query)
    # Each agent's spans are tagged separately, both share the conversation

Decorator Pattern (Zero-Touch)

Use @observe() for automatic tracking of everything:

from last9_genai import observe

@observe()  # That's it!
def call_llm(prompt: str):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return response

# Automatically tracks:
# - Input (prompt)
# - Output (response)
# - Latency (span duration)
# - Context (conversation_id, workflow_id if set)

# Works seamlessly with context managers
with conversation_context(conversation_id="session_456"):
    response = call_llm("Explain quantum computing")
    # Span automatically has conversation_id!

Optional: Cost Tracking

Add cost monitoring by providing model pricing:

from last9_genai import ModelPricing

# Add pricing when creating processor
processor = Last9SpanProcessor(custom_pricing={
    "gpt-4o": ModelPricing(input=2.50, output=10.0),
    "claude-sonnet-4-5": ModelPricing(input=3.0, output=15.0),
})

# Or with decorator
pricing = {"gpt-4o": ModelPricing(input=2.50, output=10.0)}

@observe(pricing=pricing)
def call_llm(prompt: str):
    # Now also tracks cost automatically
    return client.chat.completions.create(...)


Tags and Categories

Add tags and categories for better filtering and organization in your observability platform:

from last9_genai import observe

@observe(
    tags=["production", "customer_support"],
    metadata={
        "category": "customer_support",  # Appears in Last9 dashboard Category column
        "version": "1.0.0",
        "priority": "high"
    }
)
def handle_support_query(query: str):
    """Categorized LLM call with metadata"""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}]
    )
    return response

# Categories automatically appear in Last9 dashboard:
# - Category column in traces table
# - Category filter dropdown
# - Enhanced trace details

# Use underscores for multi-word categories:
@observe(metadata={"category": "data_analysis"})  # Shows as "data analysis"
def analyze_data(data: str):
    return client.chat.completions.create(...)

Common categories:

  • customer_support, conversational_ai, code_assistant
  • data_analysis, content_generation, summarization
  • translation, research, qa_automation

Working with OTel Auto-Instrumentation

Recommended: Combine OTel auto-instrumentation with Last9 extensions:

# Step 1: Auto-instrument with OpenTelemetry (standard attributes)
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor
OpenAIInstrumentor().instrument()

# Step 2: Add Last9 extensions (cost, workflows)
from last9_genai import Last9GenAI, ModelPricing

l9 = Last9GenAI(custom_pricing={
    "gpt-4o": ModelPricing(input=2.50, output=10.0),
})

# Now make LLM calls
from openai import OpenAI
client = OpenAI()

# OTel automatically traces this call (standard attributes)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Last9 adds cost on top of auto-traced span
from opentelemetry import trace
span = trace.get_current_span()
usage = {
    "input_tokens": response.usage.prompt_tokens,
    "output_tokens": response.usage.completion_tokens,
}
cost = l9.add_llm_cost_attributes(span, "gpt-4o", usage)
print(f"Cost: ${cost.total:.6f}")

Result: You get standard OTel attributes (automatic) + Last9 cost/workflow (manual).

Capturing Prompts, Completions, and Tool Calls

opentelemetry-instrumentation-openai-v2 (v2.x) follows the new OpenTelemetry GenAI semantic conventions and emits message content, tool calls, and completions as OTel log events, not as span attributes. The Last9 LLM dashboard reads span attributes / events, so without a bridge those payloads never reach the dashboard.

Last9LogToSpanProcessor listens to those log events and promotes their payloads onto the currently active span:

  • gen_ai.prompt (JSON array of prompt messages)
  • gen_ai.completion (JSON array of completion choices)
  • span events gen_ai.content.prompt / gen_ai.content.completion
  • indexed gen_ai.prompt.{i}.* / gen_ai.completion.{i}.* (AgentOps / Traceloop compatible)

Wire the bridge into both the tracer and logger providers:

from opentelemetry import trace, _logs
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk._logs import LoggerProvider
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor

from last9_genai import Last9SpanProcessor, Last9LogToSpanProcessor

log_bridge = Last9LogToSpanProcessor()

tracer_provider = TracerProvider()
tracer_provider.add_span_processor(Last9SpanProcessor(log_processor=log_bridge))
trace.set_tracer_provider(tracer_provider)

logger_provider = LoggerProvider()
logger_provider.add_log_record_processor(log_bridge)
_logs.set_logger_provider(logger_provider)

import os
os.environ["OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"] = "true"
OpenAIInstrumentor().instrument(logger_provider=logger_provider)

After this, every LLM call instrumented by openai-v2 has its full prompt and completion content available on the span.

Python 3.14 users: pin wrapt<2. opentelemetry-instrumentation-openai-v2 2.3b0 calls wrap_function_wrapper(module=..., name=..., wrapper=...) and wrapt 2.0 renamed the first kwarg to target=. Without the pin, instrumentation fails silently and no log events are emitted.

Usage Examples

Multi-Turn Conversations

Track conversations across multiple turns automatically:

from last9_genai import conversation_context

# Track a complete conversation session
with conversation_context(conversation_id="support_session_456", user_id="user_456"):
    # Turn 1
    response1 = client.chat.completions.create(
        messages=[{"role": "user", "content": "I need help with my order"}]
    )

    # Turn 2
    response2 = client.chat.completions.create(
        messages=[
            {"role": "user", "content": "I need help with my order"},
            {"role": "assistant", "content": response1.choices[0].message.content},
            {"role": "user", "content": "Order #12345"}
        ]
    )

    # Both calls automatically tagged with:
    # - conversation_id = "support_session_456"
    # - user_id = "user_456"
    # All turns linked together for analysis!

Complex Workflows

Track multi-step AI workflows with automatic tagging:

from last9_genai import workflow_context

# RAG workflow example
with workflow_context(workflow_id="rag_pipeline", workflow_type="retrieval"):
    # Step 1: Query expansion (automatically tagged)
    expanded_query = expand_query(user_question)

    # Step 2: Retrieval (automatically tagged)
    documents = vector_search(expanded_query)

    # Step 3: Reranking (automatically tagged)
    relevant_docs = rerank(documents, user_question)

    # Step 4: Generation (automatically tagged)
    response = generate_answer(relevant_docs, user_question)

# All 4 steps automatically have:
# - workflow_id = "rag_pipeline"
# - workflow_type = "retrieval"
# Perfect for analyzing bottlenecks and performance!

Nested Workflows and Conversations

Combine conversation and workflow tracking:

# Track conversation
with conversation_context(conversation_id="user_session_789", user_id="user_789"):

    # Inside conversation, track a specific workflow
    with workflow_context(workflow_id="product_search", workflow_type="search"):
        # Search workflow steps
        results = search_products(query)
        recommendations = rank_results(results)

    # Outside workflow, still in conversation
    followup = handle_followup_question()

# Result:
# - search_products and rank_results: both conversation_id AND workflow_id
# - handle_followup_question: only conversation_id
# Perfect granularity for analysis!

Tool/Function Tracking

Track tool calls (this uses the `l9` instance from the combined-usage example above and a tracer from `trace.get_tracer(__name__)`):

with tracer.start_span("gen_ai.tool.search") as span:
    l9.add_tool_attributes(
        span,
        tool_name="web_search",
        tool_type="search",
        arguments={"query": "weather"},
        result={"temp": 72},
        duration_ms=150
    )

OpenTelemetry Integration

Export to Last9

Set the OTLP endpoint and credentials in your environment:

export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.last9.io:443"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic YOUR_KEY"

Then configure the exporter in Python:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Setup
trace.set_tracer_provider(TracerProvider())
otlp_exporter = OTLPSpanExporter()
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(otlp_exporter)
)

Export to Console (Development)

from opentelemetry.sdk.trace.export import ConsoleSpanExporter

console_exporter = ConsoleSpanExporter()
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(console_exporter)
)

Configuration

Disable Cost Tracking

# Track tokens only, skip cost calculation
l9 = Last9GenAI(enable_cost_tracking=False)

Custom Workflow Tracker

from last9_genai import WorkflowCostTracker

tracker = WorkflowCostTracker()
l9 = Last9GenAI(workflow_tracker=tracker)

Attributes Reference

Standard OpenTelemetry (Always Set)

gen_ai.system = "openai"
gen_ai.request.model = "gpt-4o"
gen_ai.usage.input_tokens = 150
gen_ai.usage.output_tokens = 250

Last9 Extensions (Optional)

# Cost (when pricing provided)
gen_ai.usage.cost_usd = 0.002875
gen_ai.usage.cost_input_usd = 0.000375
gen_ai.usage.cost_output_usd = 0.0025

# Classification
gen_ai.l9.span.kind = "llm"  # or "tool", "chain", "agent", "prompt"

# Workflow
workflow.id = "customer_support"
workflow.total_cost_usd = 0.015
workflow.llm_calls = 3

# Conversation
gen_ai.conversation.id = "session_123"
gen_ai.conversation.turn_number = 2

# Agent (OTel GenAI semantic conventions)
gen_ai.agent.id = "support_bot_v2"
gen_ai.agent.name = "Support Bot"
gen_ai.agent.version = "2.0"
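
As a sanity check on the cost attributes, per-million pricing converts to span cost as tokens × rate ÷ 1,000,000. For example, 150 input and 250 output tokens at the gpt-4o rates used earlier ($2.50 / $10.00 per 1M):

```python
def cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost in USD for a token count at a USD-per-1M-tokens rate."""
    return tokens * usd_per_million / 1_000_000

input_cost = cost_usd(150, 2.50)        # 0.000375
output_cost = cost_usd(250, 10.0)       # 0.0025
total_cost = input_cost + output_cost   # ≈ 0.002875
```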

Model Pricing

No default pricing included. You provide pricing for models you use.

Finding Pricing

Pricing Format

All prices in USD per million tokens:

ModelPricing(
    input=3.0,   # $3 per 1M input tokens
    output=15.0  # $15 per 1M output tokens
)

Conversion (for input=3.0, i.e. $3 per 1M input tokens):

  • Per-token: $0.000003
  • Per-1K tokens: $0.003
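
Going the other way: if a provider quotes per-token or per-1K prices, scale up to per-million before building ModelPricing. `per_million` here is a throwaway helper for illustration, not part of the SDK:

```python
def per_million(price_usd: float, unit_tokens: int) -> float:
    """Scale a price quoted per `unit_tokens` tokens to per-1M tokens."""
    return price_usd * 1_000_000 / unit_tokens

rate_from_per_token = per_million(0.000003, 1)  # ≈ 3.0
rate_from_per_1k = per_million(0.003, 1_000)    # ≈ 3.0
```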

Common Models (February 2026)

custom_pricing = {
    # Anthropic
    "claude-opus-4-6": ModelPricing(input=15.0, output=75.0),
    "claude-sonnet-4-5": ModelPricing(input=3.0, output=15.0),
    "claude-haiku-4-5": ModelPricing(input=0.8, output=4.0),

    # OpenAI
    "gpt-4o": ModelPricing(input=2.50, output=10.0),
    "gpt-4o-mini": ModelPricing(input=0.15, output=0.60),
    "o1": ModelPricing(input=15.0, output=60.0),

    # Google
    "gemini-1.5-pro": ModelPricing(input=1.25, output=10.0),
    "gemini-2.0-flash": ModelPricing(input=0.075, output=0.30),
}

Special Cases

Azure OpenAI:

custom_pricing = {
    "azure/gpt-4o": ModelPricing(input=2.50, output=10.0),
}

Self-hosted (free):

custom_pricing = {
    "ollama/llama3.1": ModelPricing(input=0.0, output=0.0),
}

Fine-tuned:

custom_pricing = {
    "ft:gpt-3.5-turbo:org:model:id": ModelPricing(input=12.0, output=16.0),
}
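
Pricing lookups match the model string exactly, so keys must match what your spans report. If model names arrive with varying provider prefixes, a small fallback helper (hypothetical, not part of the SDK) can try the exact key first and then the unprefixed name:

```python
def lookup_pricing(custom_pricing: dict, model: str):
    """Exact-match lookup, falling back to the name after 'provider/'."""
    if model in custom_pricing:
        return custom_pricing[model]
    return custom_pricing.get(model.split("/", 1)[-1])

# Stand-in values; in practice these would be ModelPricing instances.
pricing = {"azure/gpt-4o": "azure_rate", "gpt-4o": "openai_rate"}
```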

Examples

See the examples/ directory in the repository for basic usage, auto-tracking (recommended), and advanced examples.

Contributing

Contributions welcome! Please:

  1. Fork the repo
  2. Create a feature branch
  3. Add tests
  4. Submit a PR

License

MIT License - see LICENSE

Support


Built with ❤️ by Last9



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

last9_genai-1.2.0.tar.gz (78.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

last9_genai-1.2.0-py3-none-any.whl (30.4 kB view details)

Uploaded Python 3

File details

Details for the file last9_genai-1.2.0.tar.gz.

File metadata

  • Download URL: last9_genai-1.2.0.tar.gz
  • Upload date:
  • Size: 78.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for last9_genai-1.2.0.tar.gz
Algorithm Hash digest
SHA256 e7a7c006931ef2282ec47d7110a35dd76e1d01e6801f19e8893f4668f8c11710
MD5 4107c5bcb801e6cbd2de1b6e374cf6e3
BLAKE2b-256 7b441e88dde3b16a74cc1336a232148f90bf1bcc81cb39856181f14322333a46

See more details on using hashes here.

File details

Details for the file last9_genai-1.2.0-py3-none-any.whl.

File metadata

  • Download URL: last9_genai-1.2.0-py3-none-any.whl
  • Upload date:
  • Size: 30.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for last9_genai-1.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8cf4492b675701fd32c34df7510cdeda9791a849901af2e581ee3b1a69e38347
MD5 29c215bcc98739002a134821ad0b62ef
BLAKE2b-256 f0d31ac924dc1f3fe21eb81cbf15654560815703c3e28feba1daf86386e052d8

See more details on using hashes here.
