Revenium Python SDK
The official Revenium Python SDK — unified AI metering middleware for deeply attributed AI usage metrics. Supports OpenAI, Anthropic, Google (Gemini/Vertex AI), fal.ai, Ollama, LiteLLM, and Perplexity.
Features
- Unified SDK: Single package with middleware for all major AI providers — install only what you need
- Zero Code Changes: Drop-in integration — just import and all API calls are automatically metered
- Streaming Support: Full streaming support for all providers (both sync and async)
- Decorator Support: @revenium_metadata for automatic metadata injection and @revenium_meter for selective metering
- Tool Metering: @meter_tool to meter arbitrary tool/function calls alongside LLM API metering
- Prompt Capture: Optional capture of prompts and responses for analytics and debugging
- Terminal Summary: Real-time cost and usage summaries in your terminal (human-readable or JSON)
- Distributed Tracing: Built-in trace visualization fields for cross-service observability
- Asynchronous Processing: Background thread management for non-blocking metering operations
- Graceful Shutdown: Ensures all metering data is properly sent even during application shutdown
- Thread-Safe: Production-ready with contextvars-based context management for concurrent applications
Supported Providers
| Provider | Extra | Install Command |
|---|---|---|
| OpenAI | openai | pip install revenium-python-sdk[openai] |
| Azure OpenAI | openai | pip install revenium-python-sdk[openai] |
| Anthropic | anthropic | pip install revenium-python-sdk[anthropic] |
| AWS Bedrock (Anthropic) | anthropic | pip install revenium-python-sdk[anthropic] |
| Google Gemini | google-genai | pip install revenium-python-sdk[google-genai] |
| Google Vertex AI | google-vertex | pip install revenium-python-sdk[google-vertex] |
| Ollama | ollama | pip install revenium-python-sdk[ollama] |
| LiteLLM (Client) | litellm | pip install revenium-python-sdk[litellm] |
| LiteLLM (Proxy) | litellm-proxy | pip install revenium-python-sdk[litellm-proxy] |
| Perplexity (via OpenAI) | perplexity-openai | pip install revenium-python-sdk[perplexity-openai] |
| Perplexity (Native SDK) | perplexity-native | pip install revenium-python-sdk[perplexity-native] |
| fal.ai | fal | pip install revenium-python-sdk[fal] |
| LangChain | langchain | pip install revenium-python-sdk[langchain] |
Feature Matrix
| Feature | OpenAI | Anthropic | Google | Ollama | LiteLLM | Perplexity | fal.ai |
|---|---|---|---|---|---|---|---|
| Chat Completions | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Streaming | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Embeddings | Yes | - | Yes | Yes | Yes | - | - |
| Vision/Multimodal | Yes | Yes | Yes | - | Yes | - | Yes |
| Image Generation | - | - | Yes | - | - | - | Yes |
| Video Generation | - | - | Yes | - | - | - | Yes |
| Prompt Capture | Yes | Yes | Yes | - | Yes | - | - |
| Terminal Summary | Yes | Yes | Yes | Yes | Yes | - | - |
| Cloud Platform Support | Azure | Bedrock | Vertex AI | - | All | - | - |
| LangChain Integration | Yes | - | - | - | - | - | - |
| CrewAI Integration | - | - | - | - | Yes | - | - |
| Proxy Mode | - | - | - | - | Yes | - | - |
Installation
# Core SDK only
pip install revenium-python-sdk
# With a specific provider
pip install revenium-python-sdk[openai]
# Multiple providers
pip install "revenium-python-sdk[openai,anthropic,ollama]"
Quick Start
1. Configure Environment Variables
Create a .env file in your project directory:
# Required
REVENIUM_METERING_API_KEY=hak_your_revenium_api_key_here
# Optional (defaults to https://api.revenium.ai)
REVENIUM_METERING_BASE_URL=https://api.revenium.ai
# Provider API keys (set whichever you use)
OPENAI_API_KEY=sk-your_openai_key
ANTHROPIC_API_KEY=sk-ant-your_anthropic_key
GOOGLE_API_KEY=your_google_key
PERPLEXITY_API_KEY=pplx_your_key
FAL_KEY=your_fal_key
FIREWORKS_API_KEY=your_fireworks_key
# Optional
# REVENIUM_LOG_LEVEL=DEBUG
2. Import and Use
Just import the middleware for your provider. That's it - all API calls are automatically metered:
from dotenv import load_dotenv
load_dotenv()
import openai
import revenium_middleware_openai # Auto-initializes on import
client = openai.OpenAI()
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
# Usage data automatically sent to Revenium
Provider Usage Guides
OpenAI
Supports chat completions, streaming, embeddings, function calling, and vision/multimodal.
from dotenv import load_dotenv
load_dotenv()
import openai
import revenium_middleware_openai # Auto-initializes
client = openai.OpenAI()
# Basic chat completion
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello!"}],
usage_metadata={
"organizationName": "AcmeCorp",
"productName": "customer-chatbot",
"trace_id": "session-123",
"task_type": "chat"
}
)
# Streaming
stream = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Tell me a story"}],
stream=True
)
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="", flush=True)
# Embeddings
embedding = client.embeddings.create(
model="text-embedding-3-small",
input="The quick brown fox"
)
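Async clients follow the same pattern. A minimal sketch using the official openai.AsyncOpenAI client, assuming the middleware covers the async paths as the feature list indicates:
import asyncio
import openai
import revenium_middleware_openai  # Auto-initializes

async def main():
    client = openai.AsyncOpenAI()
    # Async calls are metered just like their sync counterparts
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

asyncio.run(main())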
Azure OpenAI
The middleware automatically detects Azure OpenAI when using AzureOpenAI() and resolves deployment names to standard model names for accurate pricing.
import os
from openai import AzureOpenAI
import revenium_middleware_openai
client = AzureOpenAI(
azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
api_key=os.getenv("AZURE_OPENAI_API_KEY"),
api_version="2024-02-01"
)
response = client.chat.completions.create(
model="my-gpt4-deployment", # Azure deployment name
messages=[{"role": "user", "content": "Hello!"}]
)
# Model name automatically resolved for pricing
Azure environment variables:
- AZURE_OPENAI_ENDPOINT - Your Azure OpenAI endpoint
- AZURE_OPENAI_API_KEY - Your Azure OpenAI API key
- AZURE_OPENAI_DEPLOYMENT - Default deployment name
Examples: examples/openai/ - openai_basic.py, openai_streaming.py, azure_basic.py, azure_streaming.py
Anthropic
Supports messages, streaming, vision/multimodal, and AWS Bedrock integration.
from dotenv import load_dotenv
load_dotenv()
import anthropic
import revenium_middleware_anthropic # Auto-initializes
client = anthropic.Anthropic()
# Basic message
message = client.messages.create(
model="claude-3-haiku-20240307",
max_tokens=100,
messages=[{"role": "user", "content": "Hello!"}],
usage_metadata={
"organizationName": "AcmeCorp",
"productName": "support-bot",
"trace_id": "session-456"
}
)
# Streaming
with client.messages.stream(
model="claude-3-haiku-20240307",
max_tokens=200,
messages=[{"role": "user", "content": "Tell me a story"}],
usage_metadata={"task_type": "creative"}
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
Note: The middleware only wraps messages.create and messages.stream endpoints. Other Anthropic SDK features work normally but aren't metered.
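The async client works the same way. A minimal sketch with anthropic.AsyncAnthropic, assuming the middleware patches the async stream path as the feature list indicates:
import asyncio
import anthropic
import revenium_middleware_anthropic  # Auto-initializes

async def main():
    client = anthropic.AsyncAnthropic()
    # Async streaming is metered once the stream completes
    async with client.messages.stream(
        model="claude-3-haiku-20240307",
        max_tokens=200,
        messages=[{"role": "user", "content": "Tell me a story"}]
    ) as stream:
        async for text in stream.text_stream:
            print(text, end="", flush=True)

asyncio.run(main())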
AWS Bedrock
The middleware provides complete AWS Bedrock integration with automatic detection.
import anthropic
import revenium_middleware_anthropic
# Bedrock is automatically detected when AWS credentials are available
# and base_url contains 'amazonaws.com'
client = anthropic.AnthropicBedrock(
aws_region="us-east-1"
)
message = client.messages.create(
model="anthropic.claude-3-haiku-20240307-v1:0",
max_tokens=100,
messages=[{"role": "user", "content": "Hello from Bedrock!"}]
)
Provider detection automatically routes between Bedrock and direct Anthropic API based on:
- AWS credentials availability (aws configure, IAM roles, environment variables)
- Base URL detection (when base_url contains amazonaws.com)
- Defaults to the direct Anthropic API; Bedrock is only used when explicitly configured
Bedrock environment variables:
| Variable | Description | Default |
|---|---|---|
| AWS_REGION | AWS region for Bedrock | us-east-1 |
| REVENIUM_BEDROCK_DISABLE | Set to 1 to disable Bedrock support | Not set |
AWS authentication uses the standard credential chain: environment variables, ~/.aws/credentials, IAM roles, AWS SSO. Required permissions: bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream.
Supported Bedrock models:
| Anthropic Model | Bedrock Model ID |
|---|---|
| claude-3-opus-20240229 | anthropic.claude-3-opus-20240229-v1:0 |
| claude-3-sonnet-20240229 | anthropic.claude-3-sonnet-20240229-v1:0 |
| claude-3-haiku-20240307 | us.anthropic.claude-3-5-haiku-20241022-v1:0 |
| claude-3-5-sonnet-20240620 | anthropic.claude-3-5-sonnet-20240620-v1:0 |
| claude-3-5-sonnet-20241022 | anthropic.claude-3-5-sonnet-20241022-v2:0 |
| claude-3-5-haiku-20241022 | anthropic.claude-3-5-haiku-20241022-v1:0 |
For other models, the middleware uses the format anthropic.{model_name}.
Examples: examples/anthropic/ - anthropic-basic.py, anthropic-streaming.py, anthropic-bedrock.py, anthropic-advanced.py
Google AI (Gemini / Vertex AI)
Supports chat completions, streaming, embeddings, image generation (Imagen), video generation, and vision/multimodal. Choose between Google AI SDK (simple API key setup) or Vertex AI SDK (production-grade with full token counting).
# Google AI SDK only (Gemini Developer API)
pip install "revenium-python-sdk[google-genai]"
# Vertex AI SDK only (recommended for production)
pip install "revenium-python-sdk[google-vertex]"
Google AI SDK
from dotenv import load_dotenv
load_dotenv()
import revenium_middleware_google
from google import genai
client = genai.Client()
response = client.models.generate_content(
model="gemini-2.0-flash-001",
contents="Hello! Introduce yourself in one sentence.",
usage_metadata={
"organizationName": "AcmeCorp",
"task_type": "chat"
}
)
print(response.text)
Vertex AI SDK
from dotenv import load_dotenv
load_dotenv()
import revenium_middleware_google
import vertexai
from vertexai.generative_models import GenerativeModel
vertexai.init(project="your-gcp-project", location="us-central1")
model = GenerativeModel("gemini-2.0-flash-001")
response = model.generate_content("Hello!")
print(response.text)
Which SDK should I choose?
| Use Case | Recommended SDK | Why |
|---|---|---|
| Quick prototyping | Google AI SDK | Simple API key setup |
| Production applications | Vertex AI SDK | Full token counting, enterprise features |
| Embeddings-heavy workloads | Vertex AI SDK | Complete token tracking for embeddings |
| Enterprise/GCP environments | Vertex AI SDK | Advanced Google Cloud integration |
Note: Google AI SDK embeddings don't return token counts due to API limitations, but requests are still tracked.
Google AI environment variables:
- GOOGLE_API_KEY - For Google AI SDK
- GOOGLE_CLOUD_PROJECT - For Vertex AI SDK
- GOOGLE_CLOUD_LOCATION - Vertex AI region (default: us-central1)
For Vertex AI, authenticate with: gcloud auth application-default login
Examples: examples/google/ - getting_started_google_ai.py, getting_started_vertex_ai.py, simple_streaming_test.py, simple_embeddings_test.py
Ollama
Supports chat completions, text generation, embeddings, and streaming. Works with any Ollama model.
from dotenv import load_dotenv
load_dotenv()
import ollama
import revenium_middleware_ollama # Auto-initializes
# Chat completion
response = ollama.chat(
model='qwen2.5:0.5b',
messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
usage_metadata={
"organizationName": "AcmeCorp",
"task_type": "chat"
}
)
print(response['message']['content'])
# Streaming
for chunk in ollama.chat(
model='qwen2.5:0.5b',
messages=[{'role': 'user', 'content': 'Tell me a story'}],
stream=True
):
print(chunk['message']['content'], end='', flush=True)
# Text generation
response = ollama.generate(model='qwen2.5:0.5b', prompt='Once upon a time')
# Embeddings (single and batch)
response = ollama.embed(model='nomic-embed-text', input='Hello world')
response = ollama.embed(model='nomic-embed-text', input=['Text 1', 'Text 2', 'Text 3'])
Supported endpoints: ollama.chat(), ollama.generate(), ollama.embed()
OpenAI compatibility mode: You can also use Ollama with the OpenAI SDK:
import openai
import revenium_middleware_openai
openai.api_key = 'ollama'
openai.base_url = 'http://localhost:11434/v1/'
response = openai.chat.completions.create(
model="gemma2:2b",
messages=[{"role": "user", "content": "Hello!"}],
usage_metadata={"organizationName": "AcmeCorp"}
)
Prerequisites: Ensure Ollama is running (ollama serve) before making API calls.
Examples: examples/ollama/ - getting_started.py, example_streaming.py, example_metadata.py, embeddings_example.py
LiteLLM
Supports all LLM providers available through LiteLLM with two integration patterns: client-side middleware and server-side proxy callbacks.
Client Mode
from dotenv import load_dotenv
load_dotenv()
import revenium_middleware_litellm_client.middleware # Auto-initializes
import litellm
import os
litellm.api_base = os.getenv("LITELLM_PROXY_URL")
litellm.api_key = os.getenv("LITELLM_API_KEY")
response = litellm.completion(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello!"}],
usage_metadata={
"organizationName": "AcmeCorp",
"task_type": "chat"
}
)
Proxy Mode
Add the callback to your LiteLLM config.yaml for server-side integration:
litellm_settings:
callbacks: ["revenium_middleware_litellm_proxy.middleware.proxy_handler_instance"]
When using the LiteLLM proxy, pass metadata via HTTP headers (x-revenium-*).
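For example, a client calling the proxy through the OpenAI SDK can attach metadata via extra_headers. The specific header names below are illustrative only; check your proxy configuration for the supported x-revenium-* keys:
import os
from openai import OpenAI

client = OpenAI(
    base_url=os.getenv("LITELLM_PROXY_URL"),
    api_key=os.getenv("LITELLM_API_KEY"),
)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    # Hypothetical header names, shown for illustration
    extra_headers={
        "x-revenium-organization-name": "AcmeCorp",
        "x-revenium-trace-id": "session-123",
    },
)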
LiteLLM Decorators
LiteLLM provides additional tracking decorators beyond the standard @revenium_metadata and @revenium_meter:
| Decorator | Purpose |
|---|---|
| @track_agent() | Identify the AI agent |
| @track_task() | Classify the type of work |
| @track_trace() | Set trace ID for distributed tracing |
| @track_organization() | Track multi-tenant organizations |
| @track_subscription() | Track subscription-based billing |
| @track_product() | Track product-specific usage |
| @track_subscriber() | Identify end users |
| @track_quality() | Track response quality scores |
All decorators support static values, extraction from function arguments (name_from_arg), or extraction from object attributes (name_from_attr).
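A minimal sketch of the tracking decorators. The import path is an assumption based on the client middleware module shown above, and the call forms follow the static-value and name_from_arg conventions described here:
import litellm
# Import path is an assumption; adjust to your installed package layout
from revenium_middleware_litellm_client.middleware import track_agent, track_trace

@track_agent("support-bot")               # static value
@track_trace(name_from_arg="session_id")  # extracted from a function argument
def answer(session_id: str, question: str) -> str:
    response = litellm.completion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content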
CrewAI Integration
pip install "revenium-middleware-litellm[crewai]"
Pre-built wrapper for tracking CrewAI agent executions. Note: CrewAI requires Python 3.12 or earlier.
LiteLLM environment variables:
- LITELLM_PROXY_URL - Your LiteLLM proxy URL
- LITELLM_API_KEY - Your LiteLLM proxy API key
Examples: examples/litellm/ - getting_started.py, litellm_proxy_example.py, crewai_decorator_example.py
Perplexity
Supports both the OpenAI SDK (with Perplexity base URL) and the native Perplexity SDK, with streaming support.
Using OpenAI SDK
from dotenv import load_dotenv
load_dotenv()
import os
from openai import OpenAI
import revenium_middleware_perplexity # Auto-patches OpenAI
client = OpenAI(
api_key=os.getenv("PERPLEXITY_API_KEY"),
base_url="https://api.perplexity.ai"
)
response = client.chat.completions.create(
model="sonar",
messages=[{"role": "user", "content": "What is the capital of France?"}],
usage_metadata={"organizationName": "AcmeCorp"}
)
Using Native Perplexity SDK
import os
from perplexity import Perplexity
import revenium_middleware_perplexity # Auto-patches Perplexity
client = Perplexity(api_key=os.getenv("PERPLEXITY_API_KEY"))
response = client.chat.completions.create(
model="sonar",
messages=[{"role": "user", "content": "Hello!"}]
)
Both approaches work identically - the middleware automatically detects which SDK you're using.
Streaming:
stream = client.chat.completions.create(
model="sonar-pro",
messages=[{"role": "user", "content": "Write a poem"}],
stream=True,
usage_metadata={"task_type": "creative_writing"}
)
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="", flush=True)
Examples: examples/perplexity/ - getting_started.py, basic.py, streaming.py, example_decorator.py
fal.ai
Supports image, video, and audio generation through fal.ai with automatic media type detection.
import revenium_middleware_fal # Auto-activates
import fal_client
result = fal_client.subscribe(
"fal-ai/flux/dev",
arguments={
"prompt": "A beautiful sunset over mountains",
"image_size": "landscape_16_9"
},
usage_metadata={
"organizationName": "AcmeCorp",
"task_type": "image-generation"
}
)
for image in result.get("images", []):
print(f"Image URL: {image['url']}")
Supported methods: fal_client.run, fal_client.subscribe, fal_client.stream (and their async variants: run_async, subscribe_async, stream_async)
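The async variants follow the same pattern. A minimal sketch using fal_client.subscribe_async (passing usage_metadata on the async variant is an assumption based on the supported-methods list above):
import asyncio
import revenium_middleware_fal  # Auto-activates
import fal_client

async def main():
    result = await fal_client.subscribe_async(
        "fal-ai/flux/dev",
        arguments={"prompt": "A lighthouse at dawn"},
        usage_metadata={"task_type": "image-generation"},
    )
    for image in result.get("images", []):
        print(image["url"])

asyncio.run(main())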
Media type detection: The middleware automatically detects the type of media being generated (image, video, audio) based on the application name for accurate cost tracking.
Environment variables:
- FAL_KEY - Your fal.ai API key
LangChain
Callback handler that automatically tracks LLM calls, chains, tools, and agent actions.
from langchain_openai import ChatOpenAI
from revenium_middleware_langchain import ReveniumCallbackHandler
handler = ReveniumCallbackHandler(
trace_id="session-123",
agent_name="support_agent"
)
llm = ChatOpenAI(model="gpt-4", callbacks=[handler])
response = llm.invoke("Hello!")
With chains:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
output_parser = StrOutputParser()
chain = prompt | llm | output_parser
result = chain.invoke({"topic": "programming"})
With agents:
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
@tool
def get_weather(city: str) -> str:
"""Get the weather for a city."""
return f"Sunny, 72F in {city}"
agent = create_react_agent(llm, [get_weather])
result = agent.invoke(
{"messages": [HumanMessage(content="Weather in NYC?")]},
config={"callbacks": [handler]}
)
Async support:
from revenium_middleware_langchain import AsyncReveniumCallbackHandler
handler = AsyncReveniumCallbackHandler(trace_id="async-session")
llm = ChatOpenAI(model="gpt-4", callbacks=[handler])
response = await llm.ainvoke("Hello!")
Supported providers: OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, Cohere, HuggingFace, Ollama. Provider is auto-detected from LangChain class name or model name prefix.
Programmatic configuration:
from revenium_middleware_langchain import ReveniumCallbackHandler, ReveniumConfig, SubscriberConfig
config = ReveniumConfig(
api_key="hak_your_api_key",
environment="production",
organization_name="my_org",
product_name="my_product",
subscriber=SubscriberConfig(id="user_123", email="user@example.com"),
)
handler = ReveniumCallbackHandler(config=config, trace_id="session-123")
Metadata Fields
Add business context to any API call by passing a usage_metadata dictionary. All fields are optional.
| Field | Description | Use Case |
|---|---|---|
| trace_id | Unique session or conversation identifier | Link multiple API calls together for debugging, session analytics, or distributed tracing |
| task_type | Type of AI task being performed | Categorize usage by workload (e.g., "chat", "code-generation", "doc-summary") for cost analysis |
| subscriber.id | Unique user identifier | Track individual user consumption for billing, rate limiting, or analytics |
| subscriber.email | User email address | Identify users for support, compliance, or usage reports |
| subscriber.credential.name | Authentication credential name | Track which API key or service account made the request |
| subscriber.credential.value | Authentication credential value | Associate usage with specific credentials for security auditing |
| organizationName | Organization or company name | Multi-tenant cost allocation, usage quotas per organization. Auto-creates if not found |
| subscription_id | Subscription plan identifier | Track usage against subscription limits, identify plan upgrade opportunities |
| productName | Your product or feature name | Attribute AI costs to specific features (e.g., "customer-chatbot", "email-assistant"). Auto-creates if not found |
| agent | AI agent or bot identifier | Distinguish between multiple AI agents or automation workflows |
| response_quality_score | Custom quality rating (0.0-1.0) | Track user satisfaction or automated quality metrics for model performance analysis |
Example:
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello!"}],
usage_metadata={
"trace_id": "conv-28a7e9d4",
"task_type": "customer-support",
"subscriber": {
"id": "user-1234",
"email": "user@example.com",
"credential": {
"name": "engineering-api-key",
"value": "sk-1234567890abcdef"
}
},
"organizationName": "AcmeCorp",
"subscription_id": "pro-plan-Q1",
"productName": "customer-support-chatbot",
"agent": "support-agent",
"response_quality_score": 0.92
}
)
Deprecation notice: The old field names organizationId, organization_id, productId, and product_id are still supported for backward compatibility but are deprecated. Use organizationName and productName for new implementations.
API Reference: Complete metadata field documentation
Trace Visualization & Distributed Tracing
Enhanced observability fields for tracking AI operations across environments, regions, and workflows. Fields can be set via environment variables (static/deployment-level defaults) or passed directly in usage_metadata (dynamic/per-request values). Direct values always take precedence.
Available Fields
| Field | Environment Variable (Fallback) | Description | Use Case |
|---|---|---|---|
| environment | REVENIUM_ENVIRONMENT (auto-detects: ENVIRONMENT, DEPLOYMENT_ENV) | Deployment environment | Track usage across production, staging, dev |
| region | REVENIUM_REGION (auto-detects: AWS_REGION, AZURE_REGION, GCP_REGION) | Cloud region identifier | Multi-region deployment tracking and latency analysis |
| credential_alias | REVENIUM_CREDENTIAL_ALIAS | Human-readable API key name | Track which credential was used for rotation and auditing |
| trace_type | REVENIUM_TRACE_TYPE | Workflow category (max 128 chars, alphanumeric/hyphens/underscores) | Group similar workflows (e.g., "customer-support", "data-analysis") |
| trace_name | REVENIUM_TRACE_NAME | Human-readable trace label (max 256 chars) | Label trace instances (e.g., "Customer Support Chat") |
| parent_transaction_id | REVENIUM_PARENT_TRANSACTION_ID | Parent transaction ID | Link child operations to parents across microservices |
| transaction_name | REVENIUM_TRANSACTION_NAME | Human-friendly operation name | Label operations (e.g., "Generate Response", "Analyze Sentiment") |
| retry_number | REVENIUM_RETRY_NUMBER | Retry attempt number (0 = first attempt) | Track retry attempts for failed operations |
Note: operation_type (e.g., CHAT, EMBED, TOOL_CALL) and operation_subtype (e.g., function_call, streaming) are automatically detected by the middleware and cannot be overridden.
Usage
Static fields via environment variables (deployment-level defaults):
# .env file
REVENIUM_ENVIRONMENT=production
REVENIUM_REGION=us-east-1
REVENIUM_CREDENTIAL_ALIAS=prod-openai-key
REVENIUM_TRACE_TYPE=customer-support
Dynamic fields via usage_metadata (per-request values):
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello!"}],
usage_metadata={
"environment": "production",
"region": "us-east-1",
"trace_type": "customer-support",
"trace_name": "Support Chat Session",
"transaction_name": "Generate Response",
"parent_transaction_id": "parent-txn-123"
}
)
Best practice: Use environment variables for static deployment configuration (environment, region, credential_alias) and pass dynamic values (trace_name, transaction_name, organizationName) directly in usage_metadata or via decorators.
Distributed Tracing Example
import uuid
workflow_id = str(uuid.uuid4())
# Step 1: Parent operation
parent_response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Analyze this document"}],
usage_metadata={
"trace_id": "analysis-session-456",
"transaction_name": "Document Analysis",
"task_type": "analysis"
}
)
# Step 2: Child operation linked to parent
child_response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Summarize findings"}],
usage_metadata={
"trace_id": "analysis-session-456",
"parent_transaction_id": parent_response.id,
"transaction_name": "Summarize Results",
"task_type": "summarization"
}
)
Decorator Support
@revenium_metadata - Automatic Metadata Injection
Automatically injects metadata into all API calls within a function's scope. Eliminates the need to pass usage_metadata to every API call.
from revenium_middleware import revenium_metadata
@revenium_metadata(
trace_id="session-12345",
task_type="customer-support",
organizationName="AcmeCorp",
environment="production"
)
def handle_customer_query(question: str) -> str:
# All API calls automatically include the decorator metadata
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": question}]
)
return response.choices[0].message.content
answer = handle_customer_query("How do I reset my password?")
Features:
- DRY Principle: Define metadata once, apply to all API calls in the function
- Composable: Decorators can be nested - inner decorators inherit and override outer ones
- API-level override: usage_metadata passed directly to API calls always takes precedence over decorator metadata
- Async support: Works with both sync and async functions
- Thread-safe: Uses contextvars for proper isolation
Nested decorators (metadata merging):
@revenium_metadata(organizationName="AcmeCorp", environment="production")
def outer_function():
# Gets: organizationName, environment
response1 = client.chat.completions.create(...)
@revenium_metadata(trace_id="inner-trace", task_type="analysis")
def inner_function():
# Gets: organizationName, environment (inherited) + trace_id, task_type (added)
response2 = client.chat.completions.create(...)
return response2
return inner_function()
API-level override:
@revenium_metadata(organizationName="AcmeCorp", task_type="default")
def mixed_metadata():
# Uses decorator metadata
response1 = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello"}]
)
# API-level metadata overrides decorator's task_type
response2 = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Hello"}],
usage_metadata={
"task_type": "special-override", # Overrides decorator
"trace_id": "api-level-trace" # Adds new field
# organizationName still inherited from decorator
}
)
@revenium_meter - Selective Metering
Control which functions are metered when selective metering mode is enabled. This is useful for metering only specific high-value operations while ignoring others.
Note: This decorator only has an effect when REVENIUM_SELECTIVE_METERING=true is set. By default, all API calls are metered automatically.
# Enable selective metering
export REVENIUM_SELECTIVE_METERING=true
from revenium_middleware import revenium_meter, revenium_metadata
@revenium_meter()
@revenium_metadata(task_type="premium-feature", organizationName="PremiumTier")
def premium_feature(prompt: str) -> str:
# This WILL be metered (decorated with @revenium_meter)
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": prompt}]
)
return response.choices[0].message.content
def free_feature(prompt: str) -> str:
# This will NOT be metered (no @revenium_meter decorator)
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[{"role": "user", "content": prompt}]
)
return response.choices[0].message.content
Accepted values for REVENIUM_SELECTIVE_METERING:
"true","1","yes","on"(case-insensitive) - Selective metering enabled"false","0","no","off", or unset - All calls metered (default)
Decorator order matters: Place @revenium_meter before @revenium_metadata (outer to inner).
Tool Metering
The @meter_tool decorator lets you meter arbitrary tool/function calls (web scrapers, database lookups, API fetchers, image generators, etc.) alongside your automatic LLM API metering. Available via revenium_metering v6.8.2+.
import os
from revenium_middleware import meter_tool, configure
# Configure the metering client for tool calls
configure(
metering_url=os.getenv("REVENIUM_METERING_BASE_URL", "https://api.revenium.ai"),
api_key=os.environ["REVENIUM_METERING_API_KEY"],
)
# Decorate any tool function to automatically meter it
@meter_tool("customer-database", operation="lookup", agent="support-bot")
def lookup_customer(customer_id: str) -> dict:
"""Timing and success/failure are automatically tracked."""
return {"name": "Jane Smith", "plan": "Enterprise"}
# The decorator reports the tool call to Revenium automatically
result = lookup_customer("CUST-42")
Manual reporting:
from revenium_middleware import report_tool_call
report_tool_call(
tool_id="my-tool",
operation="fetch",
duration_ms=1234,
success=True,
usage_metadata={"records": 42},
)
Prompt Capture
Optional capture of prompts and responses for analytics and debugging. Disabled by default to protect sensitive data.
Enable
export REVENIUM_CAPTURE_PROMPTS=true
What Gets Captured
| Field | Description | Source |
|---|---|---|
| system_prompt | System prompt content | From system parameter / system message |
| input_messages | User/assistant messages as JSON | From messages parameter |
| output_response | Assistant's response content | From response content blocks |
| prompts_truncated | Truncation flag | Set to true if any field exceeded 50,000 characters |
Each field has a maximum length of 50,000 characters. If exceeded, it's truncated with a ...[TRUNCATED] marker.
Example
import os
os.environ["REVENIUM_CAPTURE_PROMPTS"] = "true"
import revenium_middleware_openai
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is the capital of France?"}
],
usage_metadata={"organizationName": "DemoOrg"}
)
# System prompt, input messages, and output response are now captured
Prompt capture works with both streaming and non-streaming requests, and with multimodal content (text, images, etc.).
Security Considerations
- Prompts may contain sensitive user data
- Responses may include confidential information
- Only enable in environments where data capture is appropriate
- Ensure compliance with your data privacy policies
- Use selective metering with @revenium_meter to control which calls are captured, as sketched below
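A minimal sketch combining selective metering with prompt capture so that only decorated calls are metered and captured. The environment variables are set in-process here for brevity; in production, set them in your deployment configuration:
import os
os.environ["REVENIUM_SELECTIVE_METERING"] = "true"
os.environ["REVENIUM_CAPTURE_PROMPTS"] = "true"

import revenium_middleware_openai  # Auto-initializes after the flags are set
from revenium_middleware import revenium_meter
from openai import OpenAI

client = OpenAI()

@revenium_meter()
def audited_call(prompt: str) -> str:
    # Metered and captured because of the decorator
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Calls made outside @revenium_meter-decorated functions are
# neither metered nor captured in selective mode.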
Terminal Summary Output
Display a cost and usage summary in your terminal after each API request. Useful for development, debugging, and monitoring AI costs in real-time.
Configuration
| Environment Variable | Values | Description |
|---|---|---|
| REVENIUM_PRINT_SUMMARY | false (default), true or human, json | Controls output format |
| REVENIUM_TEAM_ID | Your team ID | Required to fetch and display cost information |
# Enable human-readable output
export REVENIUM_PRINT_SUMMARY=human
# Required for cost display (find in Revenium web app)
export REVENIUM_TEAM_ID=your-team-id-here
Human-Readable Format
============================================================
REVENIUM USAGE SUMMARY
============================================================
Model: gpt-4o-mini
Provider: OPENAI
Duration: 1.23s
Token Usage:
Input Tokens: 150
Output Tokens: 250
Total Tokens: 400
Cost: $0.000045
Trace ID: abc-123
============================================================
JSON Format
{"model":"gpt-4o-mini","provider":"OPENAI","durationSeconds":1.23,"inputTokenCount":150,"outputTokenCount":250,"totalTokenCount":400,"cost":0.000045,"costStatus":"available","traceId":"abc-123"}
Cost Status
| Scenario | Display |
|---|---|
| Cost available | $0.000045 |
| REVENIUM_TEAM_ID set, cost pending | Pending (aggregating... check Revenium dashboard) |
| REVENIUM_TEAM_ID not set | Add REVENIUM_TEAM_ID to see pricing |
Configuration Reference
Required Environment Variables
| Variable | Description |
|---|---|
| REVENIUM_METERING_API_KEY | Your Revenium API key (starts with hak_ or rev_) |
Optional Environment Variables
| Variable | Default | Description |
|---|---|---|
| REVENIUM_METERING_BASE_URL | https://api.revenium.ai | Revenium API endpoint |
| REVENIUM_LOG_LEVEL | INFO | Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL |
| REVENIUM_CAPTURE_PROMPTS | false | Enable prompt capture |
| REVENIUM_PRINT_SUMMARY | false | Terminal output: false, true/human, json |
| REVENIUM_SELECTIVE_METERING | false | Only meter @revenium_meter decorated functions |
| REVENIUM_TEAM_ID | - | Team ID for cost display in terminal summary |
| REVENIUM_ENVIRONMENT | - | Deployment environment (auto-detects from ENVIRONMENT, DEPLOYMENT_ENV) |
| REVENIUM_REGION | - | Cloud region (auto-detects from AWS_REGION, AZURE_REGION, GCP_REGION) |
| REVENIUM_CREDENTIAL_ALIAS | - | Human-readable API key name |
| REVENIUM_TRACE_TYPE | - | Workflow category identifier |
| REVENIUM_TRACE_NAME | - | Human-readable trace label |
| REVENIUM_PARENT_TRANSACTION_ID | - | Parent transaction ID for distributed tracing |
| REVENIUM_TRANSACTION_NAME | - | Human-friendly operation name |
| REVENIUM_RETRY_NUMBER | - | Retry attempt number |
| REVENIUM_BEDROCK_DISABLE | - | Set to 1 to disable Bedrock auto-detection |
Provider-Specific Environment Variables
| Variable | Provider | Description |
|---|---|---|
| OPENAI_API_KEY | OpenAI | OpenAI API key |
| AZURE_OPENAI_ENDPOINT | Azure OpenAI | Azure endpoint URL |
| AZURE_OPENAI_API_KEY | Azure OpenAI | Azure API key |
| AZURE_OPENAI_DEPLOYMENT | Azure OpenAI | Default deployment name |
| ANTHROPIC_API_KEY | Anthropic | Anthropic API key |
| AWS_REGION | Bedrock | AWS region for Bedrock (default: us-east-1) |
| GOOGLE_API_KEY | Google AI | Google AI SDK API key |
| GOOGLE_CLOUD_PROJECT | Vertex AI | GCP project ID |
| GOOGLE_CLOUD_LOCATION | Vertex AI | GCP location (default: us-central1) |
| PERPLEXITY_API_KEY | Perplexity | Perplexity API key |
| FAL_KEY | fal.ai | fal.ai API key |
| LITELLM_PROXY_URL | LiteLLM | LiteLLM proxy URL |
| LITELLM_API_KEY | LiteLLM | LiteLLM proxy API key |
Troubleshooting
| Issue | Solution |
|---|---|
| Middleware not working | Verify REVENIUM_METERING_API_KEY is set correctly (must start with hak_ or rev_) |
| No data in dashboard | Enable debug logging with REVENIUM_LOG_LEVEL=DEBUG |
| Import errors | Ensure the correct extra is installed (e.g., pip install revenium-python-sdk[openai]) |
| Azure: wrong model name | Middleware auto-resolves deployment names; check with debug logging |
| Bedrock: AccessDenied | Ensure bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream permissions |
| Bedrock: requests go to Anthropic | Verify AWS credentials: aws sts get-caller-identity |
| Google: embeddings show 0 tokens | Expected with Google AI SDK; use Vertex AI for full token counting |
| Google: "No module named 'vertexai'" | Install correct extra: pip install "revenium-python-sdk[google-vertex]" |
| Vertex AI: authentication errors | Run gcloud auth application-default login |
| Ollama: connection errors | Ensure Ollama is running: ollama serve |
| LangChain: provider shows "unknown" | Ensure you're using a supported LangChain LLM class |
| Streaming errors | Check provider credentials; middleware auto-falls back gracefully |
Debug mode: Set REVENIUM_LOG_LEVEL=DEBUG to see detailed provider detection, routing decisions, and metering payloads.
Force direct Anthropic API: Set REVENIUM_BEDROCK_DISABLE=1 to disable Bedrock auto-detection.
Check initialization status: Use revenium_middleware_<provider>.is_initialized() to verify setup.
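For example, to fail fast at startup (shown for the OpenAI middleware; substitute your provider's module):
import revenium_middleware_openai

if not revenium_middleware_openai.is_initialized():
    raise RuntimeError("Revenium middleware not initialized; check REVENIUM_METERING_API_KEY")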
Logging
This module uses Python's standard logging system. Control the log level with the REVENIUM_LOG_LEVEL environment variable:
# Enable debug logging
export REVENIUM_LOG_LEVEL=DEBUG
# Or when running your script
REVENIUM_LOG_LEVEL=DEBUG python your_script.py
Available log levels:
- DEBUG: Detailed debugging information (provider detection, routing decisions, metering payloads)
- INFO: General information (default)
- WARNING: Warning messages only
- ERROR: Error messages only
- CRITICAL: Critical error messages only
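Because standard logging is used, you can also set the level programmatically. The logger name below is an assumption based on the package name; adjust it if your logs show a different one:
import logging

# Hypothetical logger name derived from the module name
logging.getLogger("revenium_middleware").setLevel(logging.DEBUG)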
Compatibility
- Python 3.8+
- Works with all supported AI provider SDKs (latest versions recommended)
- Thread-safe and production-ready for concurrent applications
Documentation
For detailed documentation, visit docs.revenium.io
Contributing
See CONTRIBUTING.md
Code of Conduct
See CODE_OF_CONDUCT.md
Security
See SECURITY.md
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
For issues, feature requests, or contributions:
- Website: www.revenium.ai
- GitHub Repository: revenium/revenium-python-sdk
- Issues: Report bugs or request features
- Documentation: docs.revenium.io
- Email: support@revenium.io
Built by Revenium