# Last9 GenAI - Python SDK

OpenTelemetry extension for LLM observability: track conversations, workflows, and costs.
## Overview

Track conversations and workflows in your LLM applications with automatic context propagation. Built on OpenTelemetry for seamless integration with your existing observability stack.

Not a replacement for OTel auto-instrumentation — works alongside it or standalone.
Key Features:

- 🎯 Conversation Tracking: Automatic multi-turn conversation tracking with `conversation_context`
- 🤖 Agent Tracking: First-class agent identity with `agent_context` (OTel `gen_ai.agent.*` semantic conventions)
- 🔄 Workflow Management: Track complex multi-step AI workflows with `workflow_context`
- 🎨 Zero-Touch Instrumentation: `@observe()` decorator for automatic tracking
- 📊 Context Propagation: Thread-safe attribute tracking across nested operations
- 💰 Optional Cost Tracking: Bring your own pricing for cost monitoring
- 🏷️ Span Classification: Filter by type (llm/tool/chain/agent/prompt)
## Features

### Core Tracking

- 🎯 Conversation Tracking: Multi-turn conversations with `gen_ai.conversation.id` and turn numbers
- 🤖 Agent Identity: Track agents with `gen_ai.agent.id`, `gen_ai.agent.name`, `gen_ai.agent.version` (OTel semantic conventions)
- 🔄 Workflow Management: Track multi-step AI operations across LLM calls, tools, and retrievals
- 📊 Auto-Context Propagation: Thread-safe context managers that automatically tag all nested operations
- 🎨 Decorator Pattern: `@observe()` for zero-touch instrumentation with full input/output/latency tracking
- 🔧 SpanProcessor: Automatic context enrichment for all spans in your application
### Enhanced Observability

- 🏷️ Span Classification: `gen_ai.l9.span.kind` for filtering (llm/tool/chain/agent/prompt)
- 🛠️ Tool/Function Tracking: Enhanced attributes for function calls and tool usage
- ⚡ Performance Metrics: Response times, token counts, and quality scores
- 🌐 Provider Agnostic: Works with OpenAI, Anthropic, Google, Cohere, etc.
- 📏 Standard Attributes: Full OpenTelemetry `gen_ai.*` semantic conventions
### Optional Features

- 💰 Cost Tracking: Bring your own model pricing for cost monitoring
- 💸 Workflow Costing: Aggregate costs across multi-step operations
## Relationship to OpenTelemetry GenAI

This is an EXTENSION, not a replacement:

| Package | Purpose | Approach |
|---|---|---|
| OTel GenAI (`opentelemetry-instrumentation-openai-v2`) | Auto-instrument LLM SDKs | Automatic (monkey-patching) |
| Last9 GenAI (`last9-genai`) | Add conversation/workflow tracking | Context-based enrichment |

You can use:

- Last9 GenAI alone: full conversation and workflow tracking
- Both together: OTel auto-traces + Last9 adds conversation/workflow context (recommended!)

See Working with OTel Auto-Instrumentation for combined usage.
## Installation

Basic:

```bash
pip install last9-genai
```

With OTLP export (recommended):

```bash
pip install last9-genai[otlp]
```

Requirements:

- Python 3.10+
- `opentelemetry-api>=1.20.0`
- `opentelemetry-sdk>=1.20.0`
## Quick Start

Note: The examples below use `client` to represent your LLM client. Initialize your preferred provider:

```python
# OpenAI
from openai import OpenAI
client = OpenAI()

# Or Anthropic
from anthropic import Anthropic
anthropic_client = Anthropic()

# Or any other provider (Google, Cohere, etc.)
```

The SDK works with any LLM provider - just use your client normally!
### Track Conversations (Recommended)

Automatically track multi-turn conversations with zero manual instrumentation:

```python
from last9_genai import conversation_context, Last9SpanProcessor
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

# Setup tracing with Last9 processor
provider = TracerProvider()
trace.set_tracer_provider(provider)
provider.add_span_processor(Last9SpanProcessor())

# Track conversations automatically - works with any LLM provider
with conversation_context(conversation_id="session_123", user_id="user_456"):
    # OpenAI
    response1 = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )

    # Anthropic (same context!)
    response2 = anthropic_client.messages.create(
        model="claude-sonnet-4",
        max_tokens=1024,
        messages=[{"role": "user", "content": "How are you?"}]
    )

# Both calls automatically have conversation_id = "session_123"!
```
### Track Workflows

Track complex multi-step AI operations:

```python
from last9_genai import workflow_context

# Track an entire workflow with automatic tagging
with workflow_context(workflow_id="rag_search", workflow_type="retrieval"):
    # All operations automatically tagged with workflow_id
    docs = retrieve_documents(query)     # Tagged
    context = rerank_documents(docs)     # Tagged
    response = generate_answer(context)  # Tagged
# Full workflow visibility with zero manual instrumentation!

# Nest workflows and conversations
with conversation_context(conversation_id="support_123"):
    with workflow_context(workflow_id="order_lookup"):
        # Both conversation AND workflow tracked automatically
        result = lookup_and_respond()
```
### Track Agents

Track agent identity using OTel GenAI semantic conventions (`gen_ai.agent.*`):

```python
from last9_genai import agent_context

# Track agent identity — all child spans get gen_ai.agent.* attributes
with agent_context(agent_id="support_bot_v2", agent_name="Support Bot", agent_version="2.0"):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Help me with my order"}]
    )
# Span automatically has gen_ai.agent.id, gen_ai.agent.name, gen_ai.agent.version

# Nest with conversations for full context
with conversation_context(conversation_id="session_123", user_id="user_456"):
    with agent_context(agent_id="router_agent", agent_name="Router"):
        route = classify_intent(query)
    with agent_context(agent_id="support_agent", agent_name="Support"):
        response = handle_support(query)
# Each agent's spans are tagged separately; both share the conversation
```
### Decorator Pattern (Zero-Touch)

Use `@observe()` for automatic tracking of everything:

```python
from last9_genai import observe

@observe()  # That's it!
def call_llm(prompt: str):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return response

# Automatically tracks:
# - Input (prompt)
# - Output (response)
# - Latency (span duration)
# - Context (conversation_id, workflow_id if set)

# Works seamlessly with context managers
with conversation_context(conversation_id="session_456"):
    response = call_llm("Explain quantum computing")
    # Span automatically has conversation_id!
```
### Optional: Cost Tracking

Add cost monitoring by providing model pricing:

```python
from last9_genai import ModelPricing

# Add pricing when creating the processor
processor = Last9SpanProcessor(custom_pricing={
    "gpt-4o": ModelPricing(input=2.50, output=10.0),
    "claude-sonnet-4-5": ModelPricing(input=3.0, output=15.0),
})

# Or with the decorator
pricing = {"gpt-4o": ModelPricing(input=2.50, output=10.0)}

@observe(pricing=pricing)
def call_llm(prompt: str):
    # Now also tracks cost automatically
    return client.chat.completions.create(...)
```
## Tags and Categories

Add tags and categories for better filtering and organization in your observability platform:

```python
from last9_genai import observe

@observe(
    tags=["production", "customer_support"],
    metadata={
        "category": "customer_support",  # Appears in Last9 dashboard Category column
        "version": "1.0.0",
        "priority": "high"
    }
)
def handle_support_query(query: str):
    """Categorized LLM call with metadata"""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": query}]
    )
    return response

# Categories automatically appear in the Last9 dashboard:
# - Category column in the traces table
# - Category filter dropdown
# - Enhanced trace details

# Use underscores for multi-word categories:
@observe(metadata={"category": "data_analysis"})  # Shows as "data analysis"
def analyze_data(data: str):
    return client.chat.completions.create(...)
```

Common categories:

- `customer_support`, `conversational_ai`, `code_assistant`
- `data_analysis`, `content_generation`, `summarization`
- `translation`, `research`, `qa_automation`
## Working with OTel Auto-Instrumentation

Recommended: combine OTel auto-instrumentation with Last9 extensions:

```python
# Step 1: Auto-instrument with OpenTelemetry (standard attributes)
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor
OpenAIInstrumentor().instrument()

# Step 2: Add Last9 extensions (cost, workflows)
from last9_genai import Last9GenAI, ModelPricing
l9 = Last9GenAI(custom_pricing={
    "gpt-4o": ModelPricing(input=2.50, output=10.0),
})

# Now make LLM calls
from openai import OpenAI
client = OpenAI()

# OTel automatically traces this call (standard attributes)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Last9 adds cost on top of the auto-traced span
from opentelemetry import trace
span = trace.get_current_span()
usage = {
    "input_tokens": response.usage.prompt_tokens,
    "output_tokens": response.usage.completion_tokens,
}
cost = l9.add_llm_cost_attributes(span, "gpt-4o", usage)
print(f"Cost: ${cost.total:.6f}")
```

Result: you get standard OTel attributes (automatic) plus Last9 cost/workflow attributes (manual).
## Capturing Prompts, Completions, and Tool Calls

`opentelemetry-instrumentation-openai-v2` (v2.x) follows the new OpenTelemetry GenAI semantic conventions and emits message content, tool calls, and completions as OTel log events, not as span attributes. The Last9 LLM dashboard reads span attributes/events, so without a bridge those payloads never reach the dashboard.

`Last9LogToSpanProcessor` listens to those log events and promotes their payloads onto the currently active span:

- `gen_ai.prompt` (JSON array of prompt messages)
- `gen_ai.completion` (JSON array of completion choices)
- span events `gen_ai.content.prompt` / `gen_ai.content.completion`
- indexed `gen_ai.prompt.{i}.*` / `gen_ai.completion.{i}.*` (AgentOps / Traceloop compatible)

```python
import os

from opentelemetry import trace, _logs
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk._logs import LoggerProvider
from opentelemetry.instrumentation.openai_v2 import OpenAIInstrumentor
from last9_genai import Last9SpanProcessor, Last9LogToSpanProcessor

log_bridge = Last9LogToSpanProcessor()

tracer_provider = TracerProvider()
tracer_provider.add_span_processor(Last9SpanProcessor(log_processor=log_bridge))
trace.set_tracer_provider(tracer_provider)

logger_provider = LoggerProvider()
logger_provider.add_log_record_processor(log_bridge)
_logs.set_logger_provider(logger_provider)

os.environ["OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"] = "true"
OpenAIInstrumentor().instrument(logger_provider=logger_provider)
```

After this, every LLM call instrumented by openai-v2 has its full prompt and completion content available on the span.

Python 3.14 users: pin `wrapt<2`. `opentelemetry-instrumentation-openai-v2` 2.3b0 calls `wrap_function_wrapper(module=..., name=..., wrapper=...)`, and wrapt 2.0 renamed the first kwarg to `target=`. Without the pin, instrumentation fails silently and no log events are emitted.
## Usage Examples

### Multi-Turn Conversations

Track conversations across multiple turns automatically:

```python
from last9_genai import conversation_context

# Track a complete conversation session
with conversation_context(conversation_id="support_session_456", user_id="user_456"):
    # Turn 1
    response1 = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "I need help with my order"}]
    )

    # Turn 2
    response2 = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "I need help with my order"},
            {"role": "assistant", "content": response1.choices[0].message.content},
            {"role": "user", "content": "Order #12345"}
        ]
    )

# Both calls automatically tagged with:
# - conversation_id = "support_session_456"
# - user_id = "user_456"
# All turns linked together for analysis!
```
### Complex Workflows

Track multi-step AI workflows with automatic tagging:

```python
from last9_genai import workflow_context

# RAG workflow example
with workflow_context(workflow_id="rag_pipeline", workflow_type="retrieval"):
    # Step 1: Query expansion (automatically tagged)
    expanded_query = expand_query(user_question)

    # Step 2: Retrieval (automatically tagged)
    documents = vector_search(expanded_query)

    # Step 3: Reranking (automatically tagged)
    relevant_docs = rerank(documents, user_question)

    # Step 4: Generation (automatically tagged)
    response = generate_answer(relevant_docs, user_question)

# All 4 steps automatically have:
# - workflow_id = "rag_pipeline"
# - workflow_type = "retrieval"
# Perfect for analyzing bottlenecks and performance!
```
### Nested Workflows and Conversations
Combine conversation and workflow tracking:
```python
# Track conversation
with conversation_context(conversation_id="user_session_789", user_id="user_789"):
    # Inside the conversation, track a specific workflow
    with workflow_context(workflow_id="product_search", workflow_type="search"):
        # Search workflow steps
        results = search_products(query)
        recommendations = rank_results(results)

    # Outside the workflow, still in the conversation
    followup = handle_followup_question()

# Result:
# - search_products and rank_results: both conversation_id AND workflow_id
# - handle_followup_question: only conversation_id
# Perfect granularity for analysis!
```
### Tool/Function Tracking

Track tool calls:

```python
with tracer.start_span("gen_ai.tool.search") as span:
    l9.add_tool_attributes(
        span,
        tool_name="web_search",
        tool_type="search",
        arguments={"query": "weather"},
        result={"temp": 72},
        duration_ms=150
    )
```
## OpenTelemetry Integration

### Export to Last9

```bash
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.last9.io:443"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic YOUR_KEY"
```

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Setup
trace.set_tracer_provider(TracerProvider())
otlp_exporter = OTLPSpanExporter()
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(otlp_exporter)
)
```

### Export to Console (Development)

```python
from opentelemetry.sdk.trace.export import ConsoleSpanExporter

console_exporter = ConsoleSpanExporter()
trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(console_exporter)
)
```
## Configuration

### Disable Cost Tracking

```python
# Track tokens only, skip cost calculation
l9 = Last9GenAI(enable_cost_tracking=False)
```

### Custom Workflow Tracker

```python
from last9_genai import WorkflowCostTracker

tracker = WorkflowCostTracker()
l9 = Last9GenAI(workflow_tracker=tracker)
```
## Attributes Reference

### Standard OpenTelemetry (Always Set)

```python
gen_ai.system = "openai"
gen_ai.request.model = "gpt-4o"
gen_ai.usage.input_tokens = 150
gen_ai.usage.output_tokens = 250
```

### Last9 Extensions (Optional)

```python
# Cost (when pricing provided)
gen_ai.usage.cost_usd = 0.002875
gen_ai.usage.cost_input_usd = 0.000375
gen_ai.usage.cost_output_usd = 0.0025

# Classification
gen_ai.l9.span.kind = "llm"  # or "tool", "prompt"

# Workflow
workflow.id = "customer_support"
workflow.total_cost_usd = 0.015
workflow.llm_calls = 3

# Conversation
gen_ai.conversation.id = "session_123"
gen_ai.conversation.turn_number = 2

# Agent (OTel GenAI semantic conventions)
gen_ai.agent.id = "support_bot_v2"
gen_ai.agent.name = "Support Bot"
gen_ai.agent.version = "2.0"
```
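The cost attributes follow directly from the token counts and the per-million-token pricing. A minimal sketch of the arithmetic, using the gpt-4o pricing from earlier in this README (`compute_cost` is an illustrative helper, not part of the `last9_genai` API):

```python
# Illustrative arithmetic only: compute_cost is a hypothetical helper,
# not a last9_genai function.
def compute_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Turn token usage plus USD-per-1M-token pricing into USD costs."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost, output_cost, input_cost + output_cost

# 150 input / 250 output tokens at gpt-4o pricing ($2.50 / $10 per 1M)
inp, out, total = compute_cost(150, 250, 2.50, 10.0)
print(f"{inp:.6f} {out:.6f} {total:.6f}")  # 0.000375 0.002500 0.002875
```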
## Model Pricing

No default pricing is included; you provide pricing for the models you use.

### Finding Pricing

- Anthropic: https://www.anthropic.com/pricing
- OpenAI: https://openai.com/api/pricing/
- Google: https://ai.google.dev/pricing
- Community: https://www.llm-prices.com/

### Pricing Format

All prices are in USD per million tokens:

```python
ModelPricing(
    input=3.0,   # $3 per 1M input tokens
    output=15.0  # $15 per 1M output tokens
)
```

Conversion:

- Per-token: `$0.000003` → `3.0`
- Per-1K: `$0.003` → `3.0`
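The conversion rules above are just multiplication by the token-count ratio; a quick sketch, assuming nothing beyond plain arithmetic (the helper names are illustrative, not SDK functions):

```python
# Convert per-token or per-1K-token prices into the per-1M-token unit
# that ModelPricing expects. Helper names are hypothetical.
def per_token_to_per_million(price_per_token: float) -> float:
    return price_per_token * 1_000_000

def per_1k_to_per_million(price_per_1k: float) -> float:
    return price_per_1k * 1_000

print(round(per_token_to_per_million(0.000003), 6))  # 3.0
print(round(per_1k_to_per_million(0.003), 6))        # 3.0
```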
### Common Models (February 2026)

```python
custom_pricing = {
    # Anthropic
    "claude-opus-4-6": ModelPricing(input=15.0, output=75.0),
    "claude-sonnet-4-5": ModelPricing(input=3.0, output=15.0),
    "claude-haiku-4-5": ModelPricing(input=0.8, output=4.0),
    # OpenAI
    "gpt-4o": ModelPricing(input=2.50, output=10.0),
    "gpt-4o-mini": ModelPricing(input=0.15, output=0.60),
    "o1": ModelPricing(input=15.0, output=60.0),
    # Google
    "gemini-1.5-pro": ModelPricing(input=1.25, output=10.0),
    "gemini-2.0-flash": ModelPricing(input=0.075, output=0.30),
}
```

### Special Cases

Azure OpenAI:

```python
custom_pricing = {
    "azure/gpt-4o": ModelPricing(input=2.50, output=10.0),
}
```

Self-hosted (free):

```python
custom_pricing = {
    "ollama/llama3.1": ModelPricing(input=0.0, output=0.0),
}
```

Fine-tuned:

```python
custom_pricing = {
    "ft:gpt-3.5-turbo:org:model:id": ModelPricing(input=12.0, output=16.0),
}
```
## Examples

See the `examples/` directory:

Basic Usage:

- `basic_usage.py` - Simple LLM tracking
- `openai_integration.py` - OpenAI SDK
- `anthropic_integration.py` - Anthropic SDK
- `langchain_integration.py` - LangChain
- `fastapi_app.py` - FastAPI web app
- `tool_integration.py` - Function calls

Auto-Tracking (Recommended):

- `context_tracking.py` - Context managers for automatic tracking
- `decorator_tracking.py` - `@observe()` decorator pattern

Advanced:

- `conversation_tracking.py` - Multi-turn conversations
- `agent_tracking.py` - Agent identity tracking with OTel semantic conventions
## Contributing

Contributions welcome! Please:

- Fork the repo
- Create a feature branch
- Add tests
- Submit a PR

## License

MIT License - see LICENSE
## Support

- Issues: https://github.com/last9/python-ai-sdk/issues
- Documentation: https://github.com/last9/python-ai-sdk
- Last9: https://last9.io

Built with ❤️ by Last9