The official Revenium Python SDK — unified AI metering middleware for OpenAI, Anthropic, Google, Ollama, LiteLLM, Perplexity, and fal.ai.

Revenium Python SDK

The official Revenium Python SDK — unified AI metering middleware for deeply attributed AI usage metrics. Supports OpenAI, Anthropic, Google (Gemini/Vertex AI), fal.ai, Ollama, LiteLLM, and Perplexity.

Features

Unified SDK: Single package with middleware for all major AI providers — install only what you need
Zero Code Changes: Drop-in integration — just import and all API calls are automatically metered
Streaming Support: Full streaming support for all providers (both sync and async)
Decorator Support: @revenium_metadata for automatic metadata injection and @revenium_meter for selective metering
Tool Metering: @meter_tool to meter arbitrary tool/function calls alongside LLM API metering
Prompt Capture: Optional capture of prompts and responses for analytics and debugging
Terminal Summary: Real-time cost and usage summaries in your terminal (human-readable or JSON)
Distributed Tracing: Built-in trace visualization fields for cross-service observability
Asynchronous Processing: Background thread management for non-blocking metering operations
Graceful Shutdown: Ensures all metering data is properly sent even during application shutdown
Thread-Safe: Production-ready with contextvars-based context management for concurrent applications

Supported Providers

Provider	Extra	Install Command
OpenAI	`openai`	`pip install revenium-python-sdk[openai]`
Azure OpenAI	`openai`	`pip install revenium-python-sdk[openai]`
Anthropic	`anthropic`	`pip install revenium-python-sdk[anthropic]`
AWS Bedrock (Anthropic)	`anthropic`	`pip install revenium-python-sdk[anthropic]`
Google Gemini	`google-genai`	`pip install revenium-python-sdk[google-genai]`
Google Vertex AI	`google-vertex`	`pip install revenium-python-sdk[google-vertex]`
Ollama	`ollama`	`pip install revenium-python-sdk[ollama]`
LiteLLM (Client)	`litellm`	`pip install revenium-python-sdk[litellm]`
LiteLLM (Proxy)	`litellm-proxy`	`pip install revenium-python-sdk[litellm-proxy]`
Perplexity (via OpenAI)	`perplexity-openai`	`pip install revenium-python-sdk[perplexity-openai]`
Perplexity (Native SDK)	`perplexity-native`	`pip install revenium-python-sdk[perplexity-native]`
fal.ai	`fal`	`pip install revenium-python-sdk[fal]`
LangChain	`langchain`	`pip install revenium-python-sdk[langchain]`

Feature Matrix

Feature	OpenAI	Anthropic	Google	Ollama	LiteLLM	Perplexity	fal.ai
Chat Completions	Yes	Yes	Yes	Yes	Yes	Yes	Yes
Streaming	Yes	Yes	Yes	Yes	Yes	Yes	Yes
Embeddings	Yes	-	Yes	Yes	Yes	-	-
Vision/Multimodal	Yes	Yes	Yes	-	Yes	-	Yes
Image Generation	-	-	Yes	-	-	-	Yes
Video Generation	-	-	Yes	-	-	-	Yes
Prompt Capture	Yes	Yes	Yes	-	Yes	-	-
Terminal Summary	Yes	Yes	Yes	Yes	Yes	-	-
Cloud Platform Support	Azure	Bedrock	Vertex AI	-	All	-	-
LangChain Integration	Yes	-	-	-	-	-	-
CrewAI Integration	-	-	-	-	Yes	-	-
Proxy Mode	-	-	-	-	Yes	-	-

Installation

# Core SDK only
pip install revenium-python-sdk

# With a specific provider
pip install revenium-python-sdk[openai]

# Multiple providers
pip install "revenium-python-sdk[openai,anthropic,ollama]"

Quick Start

1. Configure Environment Variables

Create a .env file in your project directory:

# Required
REVENIUM_METERING_API_KEY=hak_your_revenium_api_key_here
REVENIUM_METERING_BASE_URL=https://api.revenium.ai

# Provider API keys (set whichever you use)
OPENAI_API_KEY=sk-your_openai_key
ANTHROPIC_API_KEY=sk-ant-your_anthropic_key
GOOGLE_API_KEY=your_google_key
PERPLEXITY_API_KEY=pplx_your_key
FAL_KEY=your_fal_key
FIREWORKS_API_KEY=your_fireworks_key

# Optional
# REVENIUM_LOG_LEVEL=DEBUG

2. Import and Use

Just import the middleware for your provider. That's it - all API calls are automatically metered:

from dotenv import load_dotenv
load_dotenv()

import openai
import revenium_middleware_openai  # Auto-initializes on import

client = openai.OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
# Usage data automatically sent to Revenium

Agentic Outcomes (Outcome-Based Metering)

Emit per-agent terminal outcomes (CONVERTED, DEFLECTED, ESCALATED) alongside completion and tool-event records, so dashboards show business value next to AI cost.

from revenium_middleware.agentic_outcomes import AgenticOutcomeClient, AgenticOutcomeSettings

settings = AgenticOutcomeSettings(api_key="rev_sk_...")
client = AgenticOutcomeClient(settings)

client.emit_completion(...)                # one per LLM call
client.emit_tool_event(...)                # one per tool / step
client.report_outcome(job_id, {...})       # close the job with a terminal outcome
client.close()

The job is created implicitly by the first metric ingested for a given agenticJobId. Call client.create_job(job_id) explicitly if you need to record an agent run before emitting any metrics.

See examples/agentic_outcomes/ for runnable demos (sales / coding / support) with configurable failure rates and outcome distributions.

API reference: docs.revenium.io · per-endpoint reference at revenium.readme.io/reference/meter_ai_completion.

Provider Usage Guides

OpenAI

Supports chat completions, streaming, embeddings, function calling, and vision/multimodal.

from dotenv import load_dotenv
load_dotenv()

import openai
import revenium_middleware_openai  # Auto-initializes

client = openai.OpenAI()

# Basic chat completion
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "organizationName": "AcmeCorp",
        "productName": "customer-chatbot",
        "trace_id": "session-123",
        "task_type": "chat"
    }
)

# Streaming
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

# Embeddings
embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox"
)

Azure OpenAI

The middleware automatically detects Azure OpenAI when using AzureOpenAI() and resolves deployment names to standard model names for accurate pricing.

import os

from openai import AzureOpenAI
import revenium_middleware_openai  # Auto-initializes on import

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-02-01"
)

response = client.chat.completions.create(
    model="my-gpt4-deployment",  # Azure deployment name
    messages=[{"role": "user", "content": "Hello!"}]
)
# Model name automatically resolved for pricing

Azure environment variables:

AZURE_OPENAI_ENDPOINT - Your Azure OpenAI endpoint
AZURE_OPENAI_API_KEY - Your Azure OpenAI API key
AZURE_OPENAI_DEPLOYMENT - Default deployment name

Examples: examples/openai/ - openai_basic.py, openai_streaming.py, azure_basic.py, azure_streaming.py

Anthropic

Supports messages, streaming, vision/multimodal, and AWS Bedrock integration.

from dotenv import load_dotenv
load_dotenv()

import anthropic
import revenium_middleware_anthropic  # Auto-initializes

client = anthropic.Anthropic()

# Basic message
message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "organizationName": "AcmeCorp",
        "productName": "support-bot",
        "trace_id": "session-456"
    }
)

# Streaming
with client.messages.stream(
    model="claude-opus-4-7",
    max_tokens=200,
    messages=[{"role": "user", "content": "Tell me a story"}],
    usage_metadata={"task_type": "creative"}
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Note: The middleware only wraps messages.create and messages.stream endpoints. Other Anthropic SDK features work normally but aren't metered.

AWS Bedrock

The middleware provides complete AWS Bedrock integration with automatic detection.

import anthropic
import revenium_middleware_anthropic

# Bedrock is automatically detected when AWS credentials are available
# and base_url contains 'amazonaws.com'
client = anthropic.AnthropicBedrock(
    aws_region="us-east-1"
)

message = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=100,
    messages=[{"role": "user", "content": "Hello from Bedrock!"}]
)

Provider detection routes between Bedrock and the direct Anthropic API based on:

AWS credentials availability (aws configure, IAM roles, environment variables)
Base URL detection (when base_url contains amazonaws.com)

The direct Anthropic API is the default; Bedrock is used only when explicitly configured.

Bedrock environment variables:

Variable	Description	Default
`AWS_REGION`	AWS region for Bedrock	`us-east-1`
`REVENIUM_BEDROCK_DISABLE`	Set to `1` to disable Bedrock support	Not set

AWS authentication uses the standard credential chain: environment variables, ~/.aws/credentials, IAM roles, AWS SSO. Required permissions: bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream.

Supported Bedrock models:

Anthropic Model	Bedrock Model ID
`claude-opus-4-7`	`anthropic.claude-opus-4-7`
`us.claude-opus-4-7`	`us.anthropic.claude-opus-4-7`
`eu.claude-opus-4-7`	`eu.anthropic.claude-opus-4-7`
`au.claude-opus-4-7`	`au.anthropic.claude-opus-4-7`
`global.claude-opus-4-7`	`global.anthropic.claude-opus-4-7`
`claude-3-opus-20240229`	`anthropic.claude-3-opus-20240229-v1:0`
`claude-3-sonnet-20240229`	`anthropic.claude-3-sonnet-20240229-v1:0`
`claude-3-haiku-20240307`	`us.anthropic.claude-3-5-haiku-20241022-v1:0`
`claude-3-5-sonnet-20240620`	`anthropic.claude-3-5-sonnet-20240620-v1:0`
`claude-3-5-sonnet-20241022`	`anthropic.claude-3-5-sonnet-20241022-v2:0`
`claude-3-5-haiku-20241022`	`anthropic.claude-3-5-haiku-20241022-v1:0`

For other models, the middleware uses the format anthropic.{model_name}.
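The lookup-with-fallback behaviour described above can be sketched as follows. This is an illustration only, not the middleware's source: `BEDROCK_MODEL_IDS` and `to_bedrock_model_id` are hypothetical names, and the mapping holds just a few rows copied from the table.

```python
# Hypothetical sketch of a table lookup with the documented fallback format.
# The names below are illustrative and are not part of the SDK's API.
BEDROCK_MODEL_IDS = {
    "claude-opus-4-7": "anthropic.claude-opus-4-7",
    "claude-3-opus-20240229": "anthropic.claude-3-opus-20240229-v1:0",
    "claude-3-5-sonnet-20241022": "anthropic.claude-3-5-sonnet-20241022-v2:0",
}


def to_bedrock_model_id(model_name: str) -> str:
    """Return the mapped Bedrock ID, or fall back to anthropic.{model_name}."""
    return BEDROCK_MODEL_IDS.get(model_name, f"anthropic.{model_name}")


print(to_bedrock_model_id("claude-3-opus-20240229"))  # mapped entry
print(to_bedrock_model_id("some-future-model"))       # fallback format
```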

Examples: examples/anthropic/ - anthropic-basic.py, anthropic-streaming.py, anthropic-bedrock.py, anthropic-advanced.py

Google AI (Gemini / Vertex AI)

Supports chat completions, streaming, embeddings, image generation (Imagen), video generation, and vision/multimodal. Choose between the Google AI SDK (simple API key setup) and the Vertex AI SDK (production-grade with full token counting).

# Google AI SDK only (Gemini Developer API)
pip install "revenium-python-sdk[google-genai]"

# Vertex AI SDK only (recommended for production)
pip install "revenium-python-sdk[google-vertex]"

Google AI SDK

from dotenv import load_dotenv
load_dotenv()

import revenium_middleware_google
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.0-flash-001",
    contents="Hello! Introduce yourself in one sentence.",
    usage_metadata={
        "organizationName": "AcmeCorp",
        "task_type": "chat"
    }
)
print(response.text)

Vertex AI SDK

from dotenv import load_dotenv
load_dotenv()

import revenium_middleware_google
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-gcp-project", location="us-central1")
model = GenerativeModel("gemini-2.0-flash-001")
response = model.generate_content("Hello!")
print(response.text)

Which SDK should I choose?

Use Case	Recommended SDK	Why
Quick prototyping	Google AI SDK	Simple API key setup
Production applications	Vertex AI SDK	Full token counting, enterprise features
Embeddings-heavy workloads	Vertex AI SDK	Complete token tracking for embeddings
Enterprise/GCP environments	Vertex AI SDK	Advanced Google Cloud integration

Note: Google AI SDK embeddings don't return token counts due to API limitations, but requests are still tracked.

Google AI environment variables:

GOOGLE_API_KEY - For Google AI SDK
GOOGLE_CLOUD_PROJECT - For Vertex AI SDK
GOOGLE_CLOUD_LOCATION - Vertex AI region (default: us-central1)

For Vertex AI, authenticate with: gcloud auth application-default login

Examples: examples/google/ - getting_started_google_ai.py, getting_started_vertex_ai.py, simple_streaming_test.py, simple_embeddings_test.py

Ollama

Supports chat completions, text generation, embeddings, and streaming. Works with any Ollama model.

from dotenv import load_dotenv
load_dotenv()

import ollama
import revenium_middleware_ollama  # Auto-initializes

# Chat completion
response = ollama.chat(
    model='qwen2.5:0.5b',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    usage_metadata={
        "organizationName": "AcmeCorp",
        "task_type": "chat"
    }
)
print(response['message']['content'])

# Streaming
for chunk in ollama.chat(
    model='qwen2.5:0.5b',
    messages=[{'role': 'user', 'content': 'Tell me a story'}],
    stream=True
):
    print(chunk['message']['content'], end='', flush=True)

# Text generation
response = ollama.generate(model='qwen2.5:0.5b', prompt='Once upon a time')

# Embeddings (single and batch)
response = ollama.embed(model='nomic-embed-text', input='Hello world')
response = ollama.embed(model='nomic-embed-text', input=['Text 1', 'Text 2', 'Text 3'])

Supported endpoints: ollama.chat(), ollama.generate(), ollama.embed()

OpenAI compatibility mode: You can also use Ollama with the OpenAI SDK:

import openai
import revenium_middleware_openai

openai.api_key = 'ollama'
openai.base_url = 'http://localhost:11434/v1/'

response = openai.chat.completions.create(
    model="gemma2:2b",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={"organizationName": "AcmeCorp"}
)

Prerequisites: Ensure Ollama is running (ollama serve) before making API calls.

Examples: examples/ollama/ - getting_started.py, example_streaming.py, example_metadata.py, embeddings_example.py

LiteLLM

Supports all LLM providers available through LiteLLM with two integration patterns: client-side middleware and server-side proxy callbacks.

Client Mode

from dotenv import load_dotenv
load_dotenv()

import revenium_middleware_litellm_client.middleware  # Auto-initializes
import litellm
import os

litellm.api_base = os.getenv("LITELLM_PROXY_URL")
litellm.api_key = os.getenv("LITELLM_API_KEY")

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "organizationName": "AcmeCorp",
        "task_type": "chat"
    }
)

Proxy Mode

Add the callback to your LiteLLM config.yaml for server-side integration:

litellm_settings:
  callbacks: ["revenium_middleware_litellm_proxy.middleware.proxy_handler_instance"]

When using the LiteLLM proxy, pass metadata via HTTP headers (x-revenium-*).

LiteLLM Decorators

The LiteLLM middleware provides additional tracking decorators beyond the standard @revenium_metadata and @revenium_meter:

Decorator	Purpose
`@track_agent()`	Identify the AI agent
`@track_task()`	Classify the type of work
`@track_trace()`	Set trace ID for distributed tracing
`@track_organization()`	Track multi-tenant organizations
`@track_subscription()`	Track subscription-based billing
`@track_product()`	Track product-specific usage
`@track_subscriber()`	Identify end users
`@track_quality()`	Track response quality scores

All decorators support static values, extraction from function arguments (name_from_arg), or extraction from object attributes (name_from_attr).

CrewAI Integration

pip install "revenium-middleware-litellm[crewai]"

Pre-built wrapper for tracking CrewAI agent executions. Note: CrewAI requires Python 3.12 or earlier.

LiteLLM environment variables:

LITELLM_PROXY_URL - Your LiteLLM proxy URL
LITELLM_API_KEY - Your LiteLLM proxy API key

Examples: examples/litellm/ - getting_started.py, litellm_proxy_example.py, crewai_decorator_example.py

Perplexity

Supports both the OpenAI SDK (with Perplexity base URL) and the native Perplexity SDK, with streaming support.

Using OpenAI SDK

from dotenv import load_dotenv
load_dotenv()

import os

from openai import OpenAI
import revenium_middleware_perplexity  # Auto-patches OpenAI

client = OpenAI(
    api_key=os.getenv("PERPLEXITY_API_KEY"),
    base_url="https://api.perplexity.ai"
)

response = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    usage_metadata={"organizationName": "AcmeCorp"}
)

Using Native Perplexity SDK

import os

from perplexity import Perplexity
import revenium_middleware_perplexity  # Auto-patches Perplexity

client = Perplexity(api_key=os.getenv("PERPLEXITY_API_KEY"))

response = client.chat.completions.create(
    model="sonar",
    messages=[{"role": "user", "content": "Hello!"}]
)

Both approaches work identically - the middleware automatically detects which SDK you're using.

Streaming:

stream = client.chat.completions.create(
    model="sonar-pro",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True,
    usage_metadata={"task_type": "creative_writing"}
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Examples: examples/perplexity/ - getting_started.py, basic.py, streaming.py, example_decorator.py

fal.ai

Supports image, video, and audio generation through fal.ai with automatic media type detection.

import revenium_middleware_fal  # Auto-activates
import fal_client

result = fal_client.subscribe(
    "fal-ai/flux/dev",
    arguments={
        "prompt": "A beautiful sunset over mountains",
        "image_size": "landscape_16_9"
    },
    usage_metadata={
        "organizationName": "AcmeCorp",
        "task_type": "image-generation"
    }
)

for image in result.get("images", []):
    print(f"Image URL: {image['url']}")

Supported methods: fal_client.run, fal_client.subscribe, fal_client.stream (and their async variants: run_async, subscribe_async, stream_async)

Media type detection: The middleware automatically detects the type of media being generated (image, video, audio) based on the application name for accurate cost tracking.

Environment variables:

FAL_KEY - Your fal.ai API key

LangChain

Callback handler that automatically tracks LLM calls, chains, tools, and agent actions.

from langchain_openai import ChatOpenAI
from revenium_middleware_langchain import ReveniumCallbackHandler

handler = ReveniumCallbackHandler(
    trace_id="session-123",
    agent_name="support_agent"
)

llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[handler])
response = llm.invoke("Hello!")

With chains:

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
chain = prompt | llm | StrOutputParser()
result = chain.invoke({"topic": "programming"})

With agents:

from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

@tool
def get_weather(city: str) -> str:
    """Get the weather for a city."""
    return f"Sunny, 72F in {city}"

agent = create_react_agent(llm, [get_weather])
result = agent.invoke(
    {"messages": [HumanMessage(content="Weather in NYC?")]},
    config={"callbacks": [handler]}
)

Async support:

import asyncio
from revenium_middleware_langchain import AsyncReveniumCallbackHandler

handler = AsyncReveniumCallbackHandler(trace_id="async-session")
llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[handler])

async def main():
    return await llm.ainvoke("Hello!")  # await must run inside a coroutine

response = asyncio.run(main())

Supported providers: OpenAI, Anthropic, Google, AWS Bedrock, Azure OpenAI, Cohere, HuggingFace, Ollama. Provider is auto-detected from LangChain class name or model name prefix.

Programmatic configuration:

from revenium_middleware_langchain import ReveniumCallbackHandler, ReveniumConfig, SubscriberConfig

config = ReveniumConfig(
    api_key="hak_your_api_key",
    environment="production",
    organization_name="my_org",
    product_name="my_product",
    subscriber=SubscriberConfig(id="user_123", email="user@example.com"),
)

handler = ReveniumCallbackHandler(config=config, trace_id="session-123")

Metadata Fields

Add business context to any API call by passing a usage_metadata dictionary. All fields are optional.

Field	Description	Use Case
`trace_id`	Unique session or conversation identifier	Link multiple API calls together for debugging, session analytics, or distributed tracing
`task_type`	Type of AI task being performed	Categorize usage by workload (e.g., `"chat"`, `"code-generation"`, `"doc-summary"`) for cost analysis
`subscriber.id`	Unique user identifier	Track individual user consumption for billing, rate limiting, or analytics
`subscriber.email`	User email address	Identify users for support, compliance, or usage reports
`subscriber.credential.name`	Authentication credential name	Track which API key or service account made the request
`subscriber.credential.value`	Authentication credential value	Associate usage with specific credentials for security auditing
`organizationName`	Organization or company name	Multi-tenant cost allocation, usage quotas per organization. Auto-creates if not found
`subscription_id`	Subscription plan identifier	Track usage against subscription limits, identify plan upgrade opportunities
`productName`	Your product or feature name	Attribute AI costs to specific features (e.g., `"customer-chatbot"`, `"email-assistant"`). Auto-creates if not found
`agent`	AI agent or bot identifier	Distinguish between multiple AI agents or automation workflows
`response_quality_score`	Custom quality rating (0.0-1.0)	Track user satisfaction or automated quality metrics for model performance analysis

Example:

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "trace_id": "conv-28a7e9d4",
        "task_type": "customer-support",
        "subscriber": {
            "id": "user-1234",
            "email": "user@example.com",
            "credential": {
                "name": "engineering-api-key",
                "value": "sk-1234567890abcdef"
            }
        },
        "organizationName": "AcmeCorp",
        "subscription_id": "pro-plan-Q1",
        "productName": "customer-support-chatbot",
        "agent": "support-agent",
        "response_quality_score": 0.92
    }
)

Deprecation notice: The legacy field aliases organizationId, organization_id, productId, and product_id are accepted by this SDK only as an input-layer convenience and emit a DeprecationWarning. The Revenium backend no longer accepts them — they are translated to organizationName / productName before the wire call. Migrate to organization_name / organizationName and product_name / productName now; the input-layer aliases will be removed in the next major release.
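The alias translation described in the notice can be pictured with the sketch below. It is not the SDK's code: `LEGACY_ALIASES` and `normalize_metadata` are hypothetical names, and the rule that an explicitly supplied current name wins over a legacy alias is an assumption.

```python
import warnings

# Hypothetical alias table; the field names come from the notice above.
LEGACY_ALIASES = {
    "organizationId": "organizationName",
    "organization_id": "organizationName",
    "productId": "productName",
    "product_id": "productName",
}


def normalize_metadata(usage_metadata: dict) -> dict:
    """Translate legacy keys to their current names, warning on each one."""
    normalized = {}
    for key, value in usage_metadata.items():
        target = LEGACY_ALIASES.get(key)
        if target is None:
            normalized[key] = value
            continue
        warnings.warn(
            f"'{key}' is deprecated; use '{target}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumption: a current name supplied by the caller takes priority.
        normalized.setdefault(target, value)
    return normalized


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = normalize_metadata({"organization_id": "AcmeCorp", "task_type": "chat"})

print(result)
print(len(caught), "deprecation warning(s) emitted")
```

Running with `python -W error::DeprecationWarning` is one way to surface any remaining legacy keys in a test suite before the aliases are removed.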

API Reference: Complete metadata field documentation

Trace Visualization & Distributed Tracing

Enhanced observability fields for tracking AI operations across environments, regions, and workflows. Fields can be set via environment variables (static/deployment-level defaults) or passed directly in usage_metadata (dynamic/per-request values). Direct values always take precedence.

Available Fields

Field	Environment Variable (Fallback)	Description	Use Case
`environment`	`REVENIUM_ENVIRONMENT` (auto-detects: `ENVIRONMENT`, `DEPLOYMENT_ENV`)	Deployment environment	Track usage across `production`, `staging`, `dev`
`region`	`REVENIUM_REGION` (auto-detects: `AWS_REGION`, `AZURE_REGION`, `GCP_REGION`)	Cloud region identifier	Multi-region deployment tracking and latency analysis
`credential_alias`	`REVENIUM_CREDENTIAL_ALIAS`	Human-readable API key name	Track which credential was used for rotation and auditing
`trace_type`	`REVENIUM_TRACE_TYPE`	Workflow category (max 128 chars, alphanumeric/hyphens/underscores)	Group similar workflows (e.g., `"customer-support"`, `"data-analysis"`)
`trace_name`	`REVENIUM_TRACE_NAME`	Human-readable trace label (max 256 chars)	Label trace instances (e.g., `"Customer Support Chat"`)
`parent_transaction_id`	`REVENIUM_PARENT_TRANSACTION_ID`	Parent transaction ID	Link child operations to parents across microservices
`transaction_name`	`REVENIUM_TRANSACTION_NAME`	Human-friendly operation name	Label operations (e.g., `"Generate Response"`, `"Analyze Sentiment"`)
`retry_number`	`REVENIUM_RETRY_NUMBER`	Retry attempt number (0 = first attempt)	Track retry attempts for failed operations

Note: operation_type (e.g., CHAT, EMBED, TOOL_CALL) and operation_subtype (e.g., function_call, streaming) are automatically detected by the middleware and cannot be overridden.
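The precedence rule (direct values first, then environment variables) can be sketched as below. The variable names are taken from the table; the lookup order among the auto-detected variables, and the names `ENV_FALLBACKS` and `resolve_field`, are assumptions made for illustration rather than the SDK's implementation.

```python
import os

# Environment variables consulted for each field, in assumed priority order.
# Only a few fields from the table are included to keep the sketch short.
ENV_FALLBACKS = {
    "environment": ["REVENIUM_ENVIRONMENT", "ENVIRONMENT", "DEPLOYMENT_ENV"],
    "region": ["REVENIUM_REGION", "AWS_REGION", "AZURE_REGION", "GCP_REGION"],
    "credential_alias": ["REVENIUM_CREDENTIAL_ALIAS"],
    "trace_type": ["REVENIUM_TRACE_TYPE"],
}


def resolve_field(name, usage_metadata, environ=None):
    """Return the per-request value if present, else the first env var set."""
    environ = os.environ if environ is None else environ
    if usage_metadata.get(name) is not None:
        return usage_metadata[name]
    for variable in ENV_FALLBACKS.get(name, []):
        value = environ.get(variable)
        if value:
            return value
    return None


env = {"REVENIUM_REGION": "us-east-1", "AWS_REGION": "eu-west-1", "ENVIRONMENT": "staging"}
print(resolve_field("region", {}, env))                        # REVENIUM_REGION is checked first
print(resolve_field("region", {"region": "ap-south-1"}, env))  # direct value takes precedence
print(resolve_field("environment", {}, env))                   # auto-detected fallback
```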

Usage

Static fields via environment variables (deployment-level defaults):

# .env file
REVENIUM_ENVIRONMENT=production
REVENIUM_REGION=us-east-1
REVENIUM_CREDENTIAL_ALIAS=prod-openai-key
REVENIUM_TRACE_TYPE=customer-support

Dynamic fields via usage_metadata (per-request values):

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    usage_metadata={
        "environment": "production",
        "region": "us-east-1",
        "trace_type": "customer-support",
        "trace_name": "Support Chat Session",
        "transaction_name": "Generate Response",
        "parent_transaction_id": "parent-txn-123"
    }
)

Best practice: Use environment variables for static deployment configuration (environment, region, credential_alias) and pass dynamic values (trace_name, transaction_name, organizationName) directly in usage_metadata or via decorators.

Distributed Tracing Example

import uuid

# One trace ID shared by every step of this workflow
workflow_id = str(uuid.uuid4())

# Step 1: Parent operation
parent_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Analyze this document"}],
    usage_metadata={
        "trace_id": workflow_id,
        "transaction_name": "Document Analysis",
        "task_type": "analysis"
    }
)

# Step 2: Child operation linked to parent
child_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize findings"}],
    usage_metadata={
        "trace_id": workflow_id,
        "parent_transaction_id": parent_response.id,
        "transaction_name": "Summarize Results",
        "task_type": "summarization"
    }
)

Decorator Support

`@revenium_metadata` - Automatic Metadata Injection

Automatically injects metadata into all API calls within a function's scope. Eliminates the need to pass usage_metadata to every API call.

from revenium_middleware import revenium_metadata

@revenium_metadata(
    trace_id="session-12345",
    task_type="customer-support",
    organizationName="AcmeCorp",
    environment="production"
)
def handle_customer_query(question: str) -> str:
    # All API calls automatically include the decorator metadata
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}]
    )
    return response.choices[0].message.content

answer = handle_customer_query("How do I reset my password?")

Features:

DRY Principle: Define metadata once, apply to all API calls in the function
Composable: Decorators can be nested - inner decorators inherit and override outer ones
API-level override: usage_metadata passed directly to API calls always takes precedence over decorator metadata
Async support: Works with both sync and async functions
Thread-safe: Uses contextvars for proper isolation

Nested decorators (metadata merging):

@revenium_metadata(organizationName="AcmeCorp", environment="production")
def outer_function():
    # Gets: organizationName, environment
    response1 = client.chat.completions.create(...)

    @revenium_metadata(trace_id="inner-trace", task_type="analysis")
    def inner_function():
        # Gets: organizationName, environment (inherited) + trace_id, task_type (added)
        response2 = client.chat.completions.create(...)
        return response2

    return inner_function()

API-level override:

@revenium_metadata(organizationName="AcmeCorp", task_type="default")
def mixed_metadata():
    # Uses decorator metadata
    response1 = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}]
    )

    # API-level metadata overrides decorator's task_type
    response2 = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello"}],
        usage_metadata={
            "task_type": "special-override",  # Overrides decorator
            "trace_id": "api-level-trace"     # Adds new field
            # organizationName still inherited from decorator
        }
    )
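To make the merging rules concrete, here is a minimal standard-library sketch of how a contextvars-based decorator could produce the inheritance and override behaviour shown above. It is not the SDK's implementation: `with_metadata` and `effective_metadata` are hypothetical stand-ins, and the sketch handles synchronous functions only.

```python
import contextvars
import functools

# Metadata active in the current context; empty outside any decorated function.
_active_metadata = contextvars.ContextVar("active_metadata", default={})


def with_metadata(**fields):
    """Hypothetical stand-in for a metadata decorator (sync functions only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Inner values override outer ones; unrelated keys are inherited.
            merged = {**_active_metadata.get(), **fields}
            token = _active_metadata.set(merged)
            try:
                return func(*args, **kwargs)
            finally:
                _active_metadata.reset(token)  # restore the outer scope
        return wrapper
    return decorator


def effective_metadata(usage_metadata=None):
    """Merge call-level metadata over whatever the decorators supplied."""
    return {**_active_metadata.get(), **(usage_metadata or {})}


@with_metadata(organizationName="AcmeCorp", task_type="default")
def outer():
    @with_metadata(trace_id="inner-trace", task_type="analysis")
    def inner():
        return effective_metadata({"task_type": "special-override"})
    return effective_metadata(), inner()


outer_view, inner_view = outer()
print(outer_view)
print(inner_view)
```

Because the merged dictionary is stored in a `ContextVar` and reset in a `finally` block, each thread or asyncio task sees only its own metadata, and the outer scope is restored even if the wrapped function raises.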

`@revenium_meter` - Selective Metering

Control which functions are metered when selective metering mode is enabled. This is useful for metering only specific high-value operations while ignoring others.

Note: This decorator only has an effect when REVENIUM_SELECTIVE_METERING=true is set. By default, all API calls are metered automatically.

# Enable selective metering
export REVENIUM_SELECTIVE_METERING=true

from revenium_middleware import revenium_meter, revenium_metadata

@revenium_meter()
@revenium_metadata(task_type="premium-feature", organizationName="PremiumTier")
def premium_feature(prompt: str) -> str:
    # This WILL be metered (decorated with @revenium_meter)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

def free_feature(prompt: str) -> str:
    # This will NOT be metered (no @revenium_meter decorator)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

Accepted values for REVENIUM_SELECTIVE_METERING:

"true", "1", "yes", "on" (case-insensitive) - Selective metering enabled
"false", "0", "no", "off", or unset - All calls metered (default)

Decorator order matters: Place @revenium_meter above @revenium_metadata so that it is the outermost decorator.
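The accepted values for REVENIUM_SELECTIVE_METERING listed above amount to a case-insensitive truthy check. A minimal sketch, assuming that surrounding whitespace is ignored and that unrecognized values count as disabled (neither is confirmed by the SDK); `selective_metering_enabled` is a hypothetical helper name:

```python
import os

TRUTHY = {"true", "1", "yes", "on"}


def selective_metering_enabled(environ=None) -> bool:
    """Return True only for the truthy spellings listed in this section."""
    environ = os.environ if environ is None else environ
    raw = environ.get("REVENIUM_SELECTIVE_METERING", "")
    # Assumption: whitespace is stripped and anything unrecognized is False.
    return raw.strip().lower() in TRUTHY


print(selective_metering_enabled({"REVENIUM_SELECTIVE_METERING": "YES"}))
print(selective_metering_enabled({"REVENIUM_SELECTIVE_METERING": "off"}))
print(selective_metering_enabled({}))
```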

Tool Metering

The @meter_tool decorator lets you meter arbitrary tool/function calls (web scrapers, database lookups, API fetchers, image generators, etc.) alongside your automatic LLM API metering. Available via revenium_metering v6.8.2+.

import os
from revenium_middleware import meter_tool, configure

# Configure the metering client for tool calls
configure(
    metering_url=os.getenv("REVENIUM_METERING_BASE_URL", "https://api.revenium.ai"),
    api_key=os.environ["REVENIUM_METERING_API_KEY"],
)

# Decorate any tool function to automatically meter it
@meter_tool("customer-database", operation="lookup", agent="support-bot")
def lookup_customer(customer_id: str) -> dict:
    """Timing and success/failure are automatically tracked."""
    return {"name": "Jane Smith", "plan": "Enterprise"}

# The decorator reports the tool call to Revenium automatically
result = lookup_customer("CUST-42")

Manual reporting:

from revenium_middleware import report_tool_call

report_tool_call(
    tool_id="my-tool",
    operation="fetch",
    duration_ms=1234,
    success=True,
    usage_metadata={"records": 42},
)

Prompt Capture

Optional capture of prompts and responses for analytics and debugging. Disabled by default to protect sensitive data.

Enable

export REVENIUM_CAPTURE_PROMPTS=true

What Gets Captured

Field	Description	Source
`system_prompt`	System prompt content	From `system` parameter / system message
`input_messages`	User/assistant messages as JSON	From `messages` parameter
`output_response`	Assistant's response content	From response content blocks
`prompts_truncated`	Truncation flag	Set to `true` if any field exceeded 50,000 characters

Each field has a maximum length of 50,000 characters. If exceeded, it's truncated with a ...[TRUNCATED] marker.
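The truncation rule can be sketched as follows. Whether the marker counts toward the 50,000-character limit is not stated here, so this sketch assumes it is appended after the first 50,000 characters; `truncate_field` is a hypothetical helper, not an SDK function.

```python
MAX_FIELD_CHARS = 50_000
TRUNCATION_MARKER = "...[TRUNCATED]"


def truncate_field(text: str, limit: int = MAX_FIELD_CHARS):
    """Return (possibly shortened text, whether truncation happened)."""
    if len(text) <= limit:
        return text, False
    # Assumption: the marker is appended after the first `limit` characters.
    return text[:limit] + TRUNCATION_MARKER, True


captured, prompts_truncated = truncate_field("x" * 60_000)
print(captured[-20:], prompts_truncated)
```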

Example

import os
os.environ["REVENIUM_CAPTURE_PROMPTS"] = "true"

import revenium_middleware_openai
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ],
    usage_metadata={"organizationName": "DemoOrg"}
)
# System prompt, input messages, and output response are now captured

Prompt capture works with both streaming and non-streaming requests, and with multimodal content (text, images, etc.).

Security Considerations

Prompts may contain sensitive user data
Responses may include confidential information
Only enable in environments where data capture is appropriate
Ensure compliance with your data privacy policies
Use selective metering with @revenium_meter to control which calls are captured

Cost Controls / Enforcement

Block outbound provider requests client-side when a Revenium cost control trips. When the circuit breaker is enabled, the middleware polls compiled enforcement rules from the Revenium API in a background daemon thread and raises BudgetExceededError before the upstream call, preventing spend beyond the configured limit.

Terminology note: The customer-facing entity is called a cost control, served by the backend at /v2/api/ai/cost-controls. This SDK polls a separate compiled-rules feed at /v2/api/ai/enforcement-rules/{teamId} and is unaffected by changes to the CRUD path — no SDK upgrade is required.

Currently wired for the OpenAI provider only; support for other providers is planned as per-provider follow-on work.

Enable

pip install 'revenium-python-sdk[openai]'

REVENIUM_CIRCUIT_BREAKER_ENABLED=true
REVENIUM_METERING_API_KEY=hak_your_key_here
REVENIUM_TEAM_ID=your_hashed_team_id
REVENIUM_ENFORCEMENT_BASE_URL=https://api.revenium.ai/profitstream  # optional

Environment Variables

Variable	Default	Description
`REVENIUM_CIRCUIT_BREAKER_ENABLED`	`false`	Master switch. `true` / `1` / `yes` / `on` to enable.
`REVENIUM_BYPASS`	`false`	When `true`, every `check_enforcement` call short-circuits to a no-op. Useful for incident response.
`REVENIUM_TEAM_ID`	—	Hashed team ID. Path component on rule fetches; required when the breaker is enabled.
`REVENIUM_ENFORCEMENT_BASE_URL`	origin of `REVENIUM_METERING_BASE_URL`	Base URL for the enforcement API. Set when the enforcement API lives behind a context-path.
`REVENIUM_CB_POLL_INTERVAL_SECONDS`	`60`	Background poll interval for rule refreshes.
`REVENIUM_CB_FAIL_MODE`	`open`	`open` (default) lets calls through when no cache exists; `closed` raises `BudgetExceededError` until rules are loaded.
`REVENIUM_CACHE_DIR`	—	When set, the rule cache is mirrored to `<dir>/revenium_enforcement_rules.json` so a restarted process doesn't fail-closed on the very first call.
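
The defaults in the table can be summarized in a short sketch. BreakerConfig and load_breaker_config are hypothetical names used for illustration only; the SDK's internal configuration handling may differ. Two assumptions are made here: REVENIUM_BYPASS accepts the same truthy spellings as the master switch, and an unrecognized REVENIUM_CB_FAIL_MODE falls back to open.

```python
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import urlsplit

TRUTHY = {"true", "1", "yes", "on"}


@dataclass(frozen=True)
class BreakerConfig:
    enabled: bool
    bypass: bool
    team_id: Optional[str]
    enforcement_base_url: str
    poll_interval_seconds: float
    fail_mode: str
    cache_dir: Optional[str]


def _flag(env: Mapping[str, str], name: str) -> bool:
    return env.get(name, "").strip().lower() in TRUTHY


def _origin(url: str) -> str:
    """Scheme and host of a URL, without any path."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def load_breaker_config(env: Mapping[str, str]) -> BreakerConfig:
    metering_url = env.get("REVENIUM_METERING_BASE_URL", "https://api.revenium.ai")
    enabled = _flag(env, "REVENIUM_CIRCUIT_BREAKER_ENABLED")
    team_id = env.get("REVENIUM_TEAM_ID") or None
    if enabled and team_id is None:
        raise ValueError("REVENIUM_TEAM_ID is required when the breaker is enabled")
    fail_mode = env.get("REVENIUM_CB_FAIL_MODE", "open").strip().lower()
    if fail_mode not in ("open", "closed"):
        fail_mode = "open"  # assumption: unknown values fall back to the default
    return BreakerConfig(
        enabled=enabled,
        bypass=_flag(env, "REVENIUM_BYPASS"),  # assumption: same truthy set
        team_id=team_id,
        enforcement_base_url=env.get("REVENIUM_ENFORCEMENT_BASE_URL")
        or _origin(metering_url),
        poll_interval_seconds=float(env.get("REVENIUM_CB_POLL_INTERVAL_SECONDS", "60")),
        fail_mode=fail_mode,
        cache_dir=env.get("REVENIUM_CACHE_DIR") or None,
    )
```

Taking a mapping rather than reading os.environ directly keeps the sketch easy to test; passing os.environ works the same way.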

Public API

Enforcement auto-initializes when the OpenAI middleware loads:

import revenium_middleware.openai  # auto-instruments openai
import openai

client = openai.OpenAI()

The pre-call check fires before every chat / embeddings / responses call. When the circuit breaker is disabled, it is a no-op. When enabled:

A daemon thread (revenium-enforcement-poll) starts on first use.
It polls GET {REVENIUM_ENFORCEMENT_BASE_URL}/v2/api/ai/enforcement-rules/{REVENIUM_TEAM_ID} every REVENIUM_CB_POLL_INTERVAL_SECONDS with the x-api-key header.
Rules are cached in-process (120 s TTL, refresh-on-stale with thundering-herd guard).
204 No Content is treated as "no rules configured" — the cache is cleared.
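
The caching behavior listed above (TTL, refresh-on-stale, a guard so only one caller refreshes at a time, and 204 clearing the cache) can be sketched like this. RuleCache and the fetch callable returning a (status_code, rules) pair are hypothetical; the real middleware's classes and HTTP handling are not shown in this document.

```python
import threading
import time
from typing import Callable, List, Optional, Tuple

FetchResult = Tuple[int, Optional[List[dict]]]


class RuleCache:
    """In-process rule cache: refresh when stale, one refresher at a time."""

    def __init__(
        self,
        fetch: Callable[[], FetchResult],
        ttl_seconds: float = 120.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._rules: List[dict] = []
        self._loaded_at: Optional[float] = None
        self._refresh_lock = threading.Lock()

    def _is_stale(self) -> bool:
        return self._loaded_at is None or self._clock() - self._loaded_at >= self._ttl

    def refresh(self) -> None:
        status, rules = self._fetch()
        if status == 204:
            self._rules = []  # "no rules configured": clear the cache
        elif status == 200:
            self._rules = list(rules or [])
        else:
            return  # on errors, keep whatever was cached before
        self._loaded_at = self._clock()

    def get(self) -> List[dict]:
        # Thundering-herd guard: callers that lose the race reuse the old cache.
        if self._is_stale() and self._refresh_lock.acquire(blocking=False):
            try:
                if self._is_stale():
                    self.refresh()
            finally:
                self._refresh_lock.release()
        return self._rules
```

Injecting the clock and the fetch function keeps the sketch testable without network access or real waiting.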

Exception Contract

from revenium_middleware.openai import BudgetExceededError

When a tripped rule matches the current request, the middleware raises before the OpenAI call is made. All structured fields are populated when the server provides them:

Attribute	Type	Description
`message`	`str`	Human-readable reason, e.g. `"Request blocked by Revenium enforcement rule: monthly-gpt4-cap"`
`rule_name`	`str \| None`	Server-side rule name
`current_value`	`float \| None`	Current metric value at the time of the block
`threshold`	`float \| None`	Configured limit
`resets_at`	`str \| None`	ISO-8601 timestamp the rule next resets
`rule_id`	`str \| int \| None`	Server-side rule identifier

BudgetExceededError does not inherit from ReveniumMiddlewareError, so the OpenAI middleware's handle_exception_safely decorator never swallows it — it always reaches your except block.

from revenium_middleware.openai import BudgetExceededError
import openai

client = openai.OpenAI()

try:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Summarize the meeting notes"}],
    )
except BudgetExceededError as exc:
    print(f"Cost limit reached: {exc.message}")
    print(f"Rule {exc.rule_name}: {exc.current_value} / {exc.threshold}; resets {exc.resets_at}")
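
Because resets_at is an ISO-8601 string (or None), a caller that wants to back off until the rule resets needs to parse it. A minimal sketch, assuming a hypothetical helper named seconds_until_reset and assuming that a timestamp without an offset is in UTC (the source does not specify this):

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_until_reset(
    resets_at: Optional[str], now: Optional[datetime] = None
) -> Optional[float]:
    """Seconds until the rule resets, or None when no timestamp was provided."""
    if not resets_at:
        return None
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    parsed = datetime.fromisoformat(resets_at.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    current = now or datetime.now(timezone.utc)
    return max(0.0, (parsed - current).total_seconds())
```

Inside the except block above, seconds_until_reset(exc.resets_at) could drive a retry delay or a user-facing message.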

Fail-Open vs Fail-Closed

By default (REVENIUM_CB_FAIL_MODE=open) enforcement failures never propagate to user code. If the rule fetch errors (network, 5xx, auth), the previous in-memory cache is preserved and a debug log line is emitted. If there is no cache yet, enforcement behaves as if no rules are configured and the request continues.

Set REVENIUM_CB_FAIL_MODE=closed to refuse calls until at least one rule fetch (or REVENIUM_CACHE_DIR snapshot) succeeds. Pair with REVENIUM_CACHE_DIR so a process restart loads the last-known rules rather than blocking every call until the first poll completes.
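
The two modes differ only while no rules are available. A sketch of that decision, using a hypothetical function pre_call_decision (here rules_loaded means that at least one rule fetch or REVENIUM_CACHE_DIR snapshot has succeeded):

```python
def pre_call_decision(
    rules_loaded: bool, matches_tripped_rule: bool, fail_mode: str = "open"
) -> str:
    """Return "block" or "allow" for a single outbound request."""
    if not rules_loaded:
        # No cache yet: fail-closed refuses the call, fail-open lets it through.
        return "block" if fail_mode == "closed" else "allow"
    # With rules available, both modes behave the same way.
    return "block" if matches_tripped_rule else "allow"
```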

Shadow Mode

Rules with shadowMode: true are observe-and-log: they are skipped by check_enforcement. Use shadow mode on the server side to audit a rule before flipping it to enforce.
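
As a rough illustration, skipping shadow rules amounts to a filter on the shadowMode flag. The helper name split_shadow_rules and the other rule fields are hypothetical; only the shadowMode key comes from the text above.

```python
from typing import Iterable, List, Mapping, Tuple


def split_shadow_rules(
    rules: Iterable[Mapping[str, object]],
) -> Tuple[List[Mapping[str, object]], List[Mapping[str, object]]]:
    """Separate rules that can block from rules that are only observed."""
    enforced: List[Mapping[str, object]] = []
    shadow: List[Mapping[str, object]] = []
    for rule in rules:
        if rule.get("shadowMode") is True:
            shadow.append(rule)  # observe-and-log only
        else:
            enforced.append(rule)
    return enforced, shadow
```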

End-to-End Example

See examples/openai/openai_blocking_demo.py for a runnable end-to-end demo using a seeded budget rule.

Configuration Reference

Required Environment Variables

Variable	Description
`REVENIUM_METERING_API_KEY`	Your Revenium API key (starts with `hak_` or `rev_`)
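
A quick startup check can catch a missing or mistyped key early. This sketch uses a hypothetical helper, looks_like_revenium_key, and checks only the documented prefixes; it does not verify that the key is actually valid.

```python
from typing import Optional

KEY_PREFIXES = ("hak_", "rev_")


def looks_like_revenium_key(key: Optional[str]) -> bool:
    """True when the key is set and starts with a documented prefix."""
    if not key:
        return False
    # Require at least one character after the prefix.
    return any(key.startswith(p) and len(key) > len(p) for p in KEY_PREFIXES)
```

For example, looks_like_revenium_key(os.environ.get("REVENIUM_METERING_API_KEY")) could gate application startup.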

Optional Environment Variables

Variable	Default	Description
`REVENIUM_METERING_BASE_URL`	`https://api.revenium.ai`	Revenium API endpoint
`REVENIUM_LOG_LEVEL`	`INFO`	Log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`
`REVENIUM_CAPTURE_PROMPTS`	`false`	Enable prompt capture
`REVENIUM_SELECTIVE_METERING`	`false`	Only meter `@revenium_meter` decorated functions
`REVENIUM_TEAM_ID`	-	Team ID for cost lookups
`REVENIUM_ENVIRONMENT`	-	Deployment environment (auto-detects from `ENVIRONMENT`, `DEPLOYMENT_ENV`)
`REVENIUM_REGION`	-	Cloud region (auto-detects from `AWS_REGION`, `AZURE_REGION`, `GCP_REGION`)
`REVENIUM_CREDENTIAL_ALIAS`	-	Human-readable API key name
`REVENIUM_TRACE_TYPE`	-	Workflow category identifier
`REVENIUM_TRACE_NAME`	-	Human-readable trace label
`REVENIUM_PARENT_TRANSACTION_ID`	-	Parent transaction ID for distributed tracing
`REVENIUM_TRANSACTION_NAME`	-	Human-friendly operation name
`REVENIUM_RETRY_NUMBER`	-	Retry attempt number
`REVENIUM_BEDROCK_DISABLE`	-	Set to `1` to disable Bedrock auto-detection
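
The auto-detection described for REVENIUM_ENVIRONMENT and REVENIUM_REGION can be pictured as a first-match lookup. This is a sketch with hypothetical helper names; it assumes that the explicit REVENIUM_ variable takes precedence and that the fallbacks are checked in the order listed in the table, neither of which the source states outright.

```python
from typing import Mapping, Optional, Sequence


def first_set(env: Mapping[str, str], names: Sequence[str]) -> Optional[str]:
    """Return the first non-empty value among the given variable names."""
    for name in names:
        value = env.get(name)
        if value:
            return value
    return None


def resolve_environment(env: Mapping[str, str]) -> Optional[str]:
    return first_set(env, ("REVENIUM_ENVIRONMENT", "ENVIRONMENT", "DEPLOYMENT_ENV"))


def resolve_region(env: Mapping[str, str]) -> Optional[str]:
    return first_set(
        env, ("REVENIUM_REGION", "AWS_REGION", "AZURE_REGION", "GCP_REGION")
    )
```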

Provider-Specific Environment Variables

Variable	Provider	Description
`OPENAI_API_KEY`	OpenAI	OpenAI API key
`AZURE_OPENAI_ENDPOINT`	Azure OpenAI	Azure endpoint URL
`AZURE_OPENAI_API_KEY`	Azure OpenAI	Azure API key
`AZURE_OPENAI_DEPLOYMENT`	Azure OpenAI	Default deployment name
`ANTHROPIC_API_KEY`	Anthropic	Anthropic API key
`AWS_REGION`	Bedrock	AWS region for Bedrock (default: `us-east-1`)
`GOOGLE_API_KEY`	Google AI	Google AI SDK API key
`GOOGLE_CLOUD_PROJECT`	Vertex AI	GCP project ID
`GOOGLE_CLOUD_LOCATION`	Vertex AI	GCP location (default: `us-central1`)
`PERPLEXITY_API_KEY`	Perplexity	Perplexity API key
`FAL_KEY`	fal.ai	fal.ai API key
`LITELLM_PROXY_URL`	LiteLLM	LiteLLM proxy URL
`LITELLM_API_KEY`	LiteLLM	LiteLLM proxy API key

Troubleshooting

Issue	Solution
Middleware not working	Verify `REVENIUM_METERING_API_KEY` is set correctly (must start with `hak_` or `rev_`)
No data in dashboard	Enable debug logging with `REVENIUM_LOG_LEVEL=DEBUG`
Import errors	Ensure the correct extra is installed (e.g., `pip install revenium-python-sdk[openai]`)
Azure: wrong model name	Middleware auto-resolves deployment names; check with debug logging
Bedrock: AccessDenied	Ensure `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` permissions
Bedrock: requests go to Anthropic	Verify AWS credentials: `aws sts get-caller-identity`
Google: embeddings show 0 tokens	Expected with Google AI SDK; use Vertex AI for full token counting
Google: "No module named 'vertexai'"	Install correct extra: `pip install "revenium-python-sdk[google-vertex]"`
Vertex AI: authentication errors	Run `gcloud auth application-default login`
Ollama: connection errors	Ensure Ollama is running: `ollama serve`
LangChain: provider shows "unknown"	Ensure you're using a supported LangChain LLM class
Streaming errors	Check provider credentials; the middleware falls back gracefully on its own

Debug mode: Set REVENIUM_LOG_LEVEL=DEBUG to see detailed provider detection, routing decisions, and metering payloads.

Force direct Anthropic API: Set REVENIUM_BEDROCK_DISABLE=1 to disable Bedrock auto-detection.

Check initialization status: Use revenium_middleware_<provider>.is_initialized() to verify setup.

Logging

This module uses Python's standard logging system. Control the log level with the REVENIUM_LOG_LEVEL environment variable:

# Enable debug logging
export REVENIUM_LOG_LEVEL=DEBUG

# Or when running your script
REVENIUM_LOG_LEVEL=DEBUG python your_script.py

Available log levels:

DEBUG: Detailed debugging information (provider detection, routing decisions, metering payloads)
INFO: General information (default)
WARNING: Warning messages only
ERROR: Error messages only
CRITICAL: Critical error messages only
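
If your application wants to use the same variable for its own loggers, the level name can be mapped onto the standard logging constants. A minimal sketch with a hypothetical helper, resolve_log_level; treating an unrecognized value as INFO is an assumption, not documented SDK behavior.

```python
import logging
import os
from typing import Mapping, Optional

LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(env: Optional[Mapping[str, str]] = None) -> int:
    """Translate REVENIUM_LOG_LEVEL into a numeric logging level."""
    source = os.environ if env is None else env
    name = source.get("REVENIUM_LOG_LEVEL", "INFO").strip().upper()
    # Assumption: unknown names fall back to the documented default, INFO.
    return LEVELS.get(name, logging.INFO)
```

A typical use would be logging.getLogger("my_app").setLevel(resolve_log_level()).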

Compatibility

Python 3.8+
Works with all supported AI provider SDKs (latest versions recommended)
Thread-safe and production-ready for concurrent applications

Documentation

For detailed documentation, visit docs.revenium.io

Server-Side Cost Controls

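Cost controls (spend limits, throttling, alerts) are defined and managed server-side in Revenium, not in this SDK. The SDK reports usage; Revenium evaluates it against your configured cost controls. The optional client-side circuit breaker described under Cost Controls / Enforcement only applies the compiled enforcement rules that it polls from Revenium.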

The cost-controls API endpoint is /v2/api/ai/cost-controls. This Python SDK does not call the endpoint directly — no SDK changes are required to use cost controls. If you manage cost controls via the Revenium API, HTTP client, or curl, see docs.revenium.io for the current API reference.

Contributing

See CONTRIBUTING.md

Code of Conduct

See CODE_OF_CONDUCT.md

Security

See SECURITY.md

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For issues, feature requests, or contributions:

Website: www.revenium.ai
GitHub Repository: revenium/revenium-python-sdk
Issues: Report bugs or request features
Documentation: docs.revenium.io
Email: support@revenium.io

Built by Revenium


Release history

Version	Release Date
0.1.7	May 28, 2026
0.1.6 (this version)	May 22, 2026
0.1.5	May 21, 2026
0.1.4	May 8, 2026
0.1.3	Apr 28, 2026
0.1.2	Apr 9, 2026
0.1.1	Mar 17, 2026
0.1.0	Mar 5, 2026

