# Infinium Python SDK
The official Python SDK for the Infinium agent observability platform. Send trace data from your AI agents to Infinium, where Maestro (our behavioral intelligence engine) analyzes and scores agent performance.
Your agent does work -- calls LLMs, hits APIs, processes data. The SDK captures what happened (steps taken, tokens used, errors hit, time elapsed) and ships it to Infinium. Maestro then grades the work. The agent never judges itself; it just reports facts.
## Installation

```bash
pip install infinium-o2
```

Requirements: Python 3.9+
## Quick Start

```python
from infinium import InfiniumClient

client = InfiniumClient(
    agent_id="your-agent-id",
    agent_secret="your-agent-secret",
)
```

Send your first trace:

```python
client.send_task(
    name="Classify support ticket",
    description="Categorized an inbound support ticket by type and urgency.",
    duration=2.4,
    input_summary="Customer ticket about billing issue",
    output_summary="Category: billing, Urgency: high",
    llm_usage={"model": "gpt-4o", "provider": "openai", "prompt_tokens": 320, "completion_tokens": 45},
)
```
## Auto-Instrumentation

The fastest way to get full observability. `watch()` patches your LLM client to automatically capture every call, and `@client.trace()` wraps your function to bundle everything into a trace:

```python
from openai import OpenAI

from infinium import InfiniumClient
from infinium.integrations import watch

client = InfiniumClient(agent_id="...", agent_secret="...")
openai = watch(OpenAI())  # Patches this instance to auto-capture LLM calls

@client.trace("Summarize Article")
def summarize(article: str) -> str:
    resp = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Summarize this article in 3 bullet points."},
            {"role": "user", "content": article},
        ],
    )
    return resp.choices[0].message.content

# LLM tokens, latency, and timing are captured automatically -- the trace is auto-sent
result = summarize("The Federal Reserve announced today...")
```
### Supported Providers

| Provider | Client | Detection |
|---|---|---|
| OpenAI | `openai.OpenAI` / `openai.AsyncOpenAI` | Module name |
| Anthropic | `anthropic.Anthropic` / `anthropic.AsyncAnthropic` | Module name |
| Google Gemini | `google.generativeai.GenerativeModel` | Class name |
| xAI (Grok) | `openai.OpenAI(base_url="https://api.x.ai/...")` | Base URL |
All providers support streaming transparently -- chunk accumulation and token extraction happen automatically.
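To illustrate what transparent streaming support involves, here is a toy accumulation pass over stand-in chunks. The chunk shape below is purely illustrative, not any provider's real schema and not the SDK's implementation:

```python
# Illustrative sketch of chunk accumulation during a streamed response.
# The dict-based chunk structure is a stand-in for provider-specific objects.
def accumulate_stream(chunks):
    """Concatenate streamed text deltas and pick up usage from the final chunk."""
    text_parts = []
    usage = None
    for chunk in chunks:
        if chunk.get("delta"):
            text_parts.append(chunk["delta"])
        if chunk.get("usage"):  # many providers attach usage only to the last chunk
            usage = chunk["usage"]
    return "".join(text_parts), usage

stream = [
    {"delta": "Hello, "},
    {"delta": "world."},
    {"delta": "", "usage": {"prompt_tokens": 12, "completion_tokens": 4}},
]
text, usage = accumulate_stream(stream)
```

The point is that the caller iterates the stream as usual; the instrumentation only observes the chunks as they pass through.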
### Content Capture

By default, `watch()` only captures metadata (model, tokens, latency). To also capture input/output previews (truncated to 500 chars):

```python
openai = watch(OpenAI(), capture_content=True)
```
## Sending Traces
There are three ways to send traces, from simplest to most detailed:
### 1. `send_task()` -- Simple, Inline

Best for single-step agents. Call your LLM, then report what happened:

```python
import time

from openai import OpenAI

openai = OpenAI()

start = time.time()
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": ticket_text}],
)
duration = time.time() - start

usage = response.usage
client.send_task(
    name="Classify support ticket",
    description="Categorized an inbound support ticket.",
    duration=duration,
    llm_usage={
        "model": "gpt-4o", "provider": "openai",
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
    },
    customer={"customer_name": customer_name},
)
```
### 2. `TaskData` -- Structured, Multi-Step

When your agent runs multiple steps, build a `TaskData` with execution details. This gives Maestro the most signal:

```python
from infinium import InfiniumClient, ExecutionStep, LlmCall, LlmUsage, ExpectedOutcome

steps = []

# Step 1: Search
steps.append(ExecutionStep(
    step_number=1, action="tool_use",
    description="Search knowledge base",
    duration_ms=340,
    output_preview=f"Found {len(results)} results",
))

# Step 2: Analyze with LLM
steps.append(ExecutionStep(
    step_number=2, action="llm_inference",
    description="Analyze search results",
    duration_ms=1200,
    llm_call=LlmCall(model="gpt-4o", provider="openai", prompt_tokens=800, completion_tokens=200),
))

td = InfiniumClient.create_task_data(
    name="Research: quarterly earnings",
    description="Multi-step research with source gathering and analysis",
    duration=3.2,
    steps=steps,
    expected_outcome=ExpectedOutcome(
        task_objective="Produce sourced analysis of quarterly earnings",
        required_deliverables=["Source list", "Analysis"],
    ),
    llm_usage=LlmUsage(model="gpt-4o", provider="openai", prompt_tokens=800, completion_tokens=200, api_calls_count=1),
)

client.send_task_data(td)
```
### 3. `TraceBuilder` -- Recommended for Real Agents

Steps are context managers that auto-capture timing and exceptions:

```python
from infinium import TraceBuilder, LlmUsage

trace = TraceBuilder("Blog Post Generator", "Generate a publish-ready blog post.")
trace.set_expected_outcome(
    task_objective="Produce a 600-word blog post with SEO metadata",
    required_deliverables=["Blog post", "SEO title", "Meta description"],
)
trace.set_input_summary(f"Topic: {topic}")

with trace.step("llm_inference", "Create outline") as step:
    resp = openai.chat.completions.create(model="gpt-4o", messages=[...])
    outline = resp.choices[0].message.content
    step.set_output(outline[:500])
    step.record_llm_call("gpt-4o", provider="openai",
                         prompt_tokens=resp.usage.prompt_tokens,
                         completion_tokens=resp.usage.completion_tokens)

with trace.step("llm_inference", "Write full post") as step:
    resp = openai.chat.completions.create(model="gpt-4o", messages=[...])
    blog_post = resp.choices[0].message.content
    step.set_output(blog_post[:500])
    step.record_llm_call("gpt-4o", provider="openai",
                         prompt_tokens=resp.usage.prompt_tokens,
                         completion_tokens=resp.usage.completion_tokens)

trace.set_output_summary(f"Generated {len(blog_post.split())}-word blog post.")

task_data = trace.build()  # Auto-computes total duration
client.send_task_data(task_data)
```
## `@trace_agent` Decorator

For single-function agents, the decorator handles everything -- duration, input/output capture, error recording, and sending:

```python
import json

from infinium import trace_agent

@trace_agent("Email Classifier", client)
def classify_email(email_body: str) -> dict:
    resp = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Classify this email. Return JSON with intent, priority, department."},
            {"role": "user", "content": email_body},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

# Every call auto-sends a trace
result = classify_email("Hi, I was charged twice for my subscription...")
```
If the function raises, the error is captured in the trace, the trace is still sent, and the exception is re-raised. Use `@async_trace_agent` for async functions, or `@client.trace()`, which auto-detects both.
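A minimal sketch of that capture-then-re-raise pattern, using a plain list as a stand-in for the real transport (nothing here is the SDK's actual implementation):

```python
import time

# Illustrative sketch: the trace record is assembled and "sent" even when the
# wrapped function raises, and the exception still propagates to the caller.
sent_traces = []  # stand-in for the real send path

def traced(name):
    def decorator(fn):
        def wrapper(*args, **kwargs):
            start = time.time()
            trace = {"name": name, "error": None}
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                trace["error"] = {"type": type(exc).__name__, "message": str(exc)}
                raise  # recorded first, then re-raised
            finally:
                trace["duration"] = time.time() - start
                sent_traces.append(trace)  # trace is sent regardless of outcome
        return wrapper
    return decorator

@traced("Email Classifier")
def fail():
    raise ValueError("bad input")
```

The `finally` block is what guarantees the trace goes out on both the success and the failure path.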
## Prompt Studio

Retrieve managed prompts with optional variable substitution:

```python
prompt = client.get_prompt(
    prompt_id="your-prompt-id",
    prompt_key="your-prompt-key",
    version="latest",
    variables={"customer_name": "Acme Corp", "tone": "professional"},
)

print(prompt.content)           # Raw template
print(prompt.rendered_content)  # With variables substituted
```
When used inside a @trace_agent or @client.trace() decorated function, prompt fetches are automatically recorded in the trace.
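To illustrate the relationship between `content` and `rendered_content`, here is a toy substitution pass. The `{variable}` placeholder syntax is an assumption for the example, not Prompt Studio's documented template format:

```python
# Illustrative only: a raw template and its rendered form.
template = "Write a {tone} reply to {customer_name}."
variables = {"customer_name": "Acme Corp", "tone": "professional"}

rendered = template
for key, value in variables.items():
    # Replace each {key} placeholder with its supplied value.
    rendered = rendered.replace("{" + key + "}", value)
```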
## Async Support

Every feature has a sync and async variant:

```python
import asyncio

from infinium import AsyncInfiniumClient

async def main():
    async with AsyncInfiniumClient(agent_id="...", agent_secret="...") as client:
        await client.send_task(
            name="Moderate content",
            description="Check user content for policy violations.",
            duration=1.8,
        )

asyncio.run(main())
```
## Batch Sending

Send multiple traces at once:

```python
tasks = [InfiniumClient.create_task_data(...) for doc in documents]
result = client.send_tasks_batch(tasks)
print(f"Sent {result.successful}/{result.successful + result.failed} traces")
```
## Polling for Maestro Results

After sending a trace, poll until Maestro finishes analysis:

```python
response = client.send_task_data(task_data)
trace_id = response.data.get("traceId")

interpretation = client.wait_for_interpretation(trace_id, timeout=120, poll_interval=3)
print(interpretation.data["interpretedTraceResult"])
```
## Configuration

```python
client = InfiniumClient(
    agent_id="your-agent-id",
    agent_secret="your-agent-secret",
    base_url="https://platform.i42m.ai/api/v1",  # API endpoint (default)
    timeout=30.0,               # Request timeout in seconds
    max_retries=3,              # Retry attempts (exponential backoff)
    enable_rate_limiting=True,  # Client-side rate limiting
    requests_per_second=10.0,   # Rate limit threshold
    verify_ssl=True,            # Set False for self-signed certs
)
```
## Error Handling

```python
from infinium.exceptions import (
    AuthenticationError, RateLimitError, ValidationError,
    NetworkError, ServerError, InfiniumTimeoutError,
)

try:
    client.send_task_data(task_data)
except AuthenticationError:
    print("Bad credentials -- check agent_id and agent_secret")
except RateLimitError as e:
    print(f"Rate limited -- retry after {e.retry_after}s")
except ValidationError as e:
    print(f"Invalid input: {e.field} -- {e}")
except NetworkError:
    print("Cannot reach the API")
except InfiniumTimeoutError:
    print("Request timed out")
```
The SDK automatically retries on 5xx errors, network failures, and timeouts (up to `max_retries` attempts). Auth, validation, and rate-limit errors are not retried.
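As an illustration of what an exponential-backoff retry schedule looks like, here is a sketch; the base delay and growth factor below are assumptions for the example, not the SDK's documented values:

```python
# Illustrative only: each retry waits twice as long as the previous one.
def backoff_delays(max_retries, base=1.0, factor=2.0):
    """Delay (seconds) before each retry attempt: base, base*factor, base*factor**2, ..."""
    return [base * factor ** attempt for attempt in range(max_retries)]

delays = backoff_delays(3)  # schedule for max_retries=3
```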
## Data Types

### Trace Enrichment

| Type | Purpose |
|---|---|
| `ExecutionStep` | One thing the agent did (tool call, LLM call, decision) |
| `ErrorDetail` | A factual error record (type, message, recoverable?) |
| `LlmCall` | A single LLM invocation (model, tokens, latency) |
| `ToolCall` | A tool/API invocation (name, status code, duration) |
| `ExpectedOutcome` | What the agent was asked to produce (Maestro's grading rubric) |
| `LlmUsage` | Aggregate token/cost stats across the whole trace |
| `EnvironmentContext` | Runtime info (framework, Python version, SDK version) |
### Domain Sections

Attach context relevant to the agent's domain via keyword arguments or `set_section()`:

| Section | Use Case |
|---|---|
| Customer | Customer name, email, company |
| Sales | Lead source, deal value, pipeline stage |
| Support | Ticket ID, issue type, resolution |
| Research | Topic, sources consulted, key findings |
| Development | Language, framework, test coverage |
| Marketing | Campaign, channel, target audience |
| Content | Content type, word count, SEO data |
| Executive | Meeting agenda, action items, decisions |
| Project | Deliverables, stakeholders, blockers |
| General | Tools used, tags, notes |
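To illustrate the shape of data a domain section carries, here is a toy builder. The field names are illustrative examples drawn from the table above, not a guaranteed schema, and `build_sections` is a hypothetical helper, not part of the SDK:

```python
# Illustrative only: collect domain sections passed as keyword arguments,
# mirroring how send_task() accepts sections like customer=... or support=...
def build_sections(**sections):
    """Keep only the sections that actually carry data."""
    return {name: data for name, data in sections.items() if data}

payload = build_sections(
    customer={"customer_name": "Acme Corp"},
    support={"ticket_id": "T-1042", "issue_type": "billing"},
    research={},  # empty sections are dropped
)
```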
## Design Principle
The agent reports facts. Maestro evaluates. The SDK never asks the agent to score itself, say pass/fail, or rate its own confidence. It captures what happened -- timestamps, token counts, error types, what was produced -- and Maestro decides if the work was good.
## Documentation
For in-depth guides, API reference, and advanced usage, see the full documentation.
## Development

```bash
git clone <repository-url>
cd infinium-python-sdk
pip install -e ".[dev]"
pytest
```
## Support
- Create an issue in the project repository
- Contact support at support@i42m.ai