
Infinium Python SDK

The official Python SDK for the Infinium agent observability platform. Send trace data from your AI agents to Infinium, where Maestro (our behavioral intelligence engine) analyzes and scores agent performance.

Your agent does work -- calls LLMs, hits APIs, processes data. The SDK captures what happened (steps taken, tokens used, errors hit, time elapsed) and ships it to Infinium. Maestro then grades the work. The agent never judges itself; it just reports facts.

Installation

pip install infinium-o2

Requirements: Python 3.9+

Quick Start

from infinium import InfiniumClient

client = InfiniumClient(
    agent_id="your-agent-id",
    agent_secret="your-agent-secret",
)

Send your first trace:

client.send_task(
    name="Classify support ticket",
    description="Categorized an inbound support ticket by type and urgency.",
    duration=2.4,
    input_summary="Customer ticket about billing issue",
    output_summary="Category: billing, Urgency: high",
    llm_usage={"model": "gpt-4o", "provider": "openai", "prompt_tokens": 320, "completion_tokens": 45},
)

Auto-Instrumentation

The fastest way to get full observability. watch() patches your LLM client to automatically capture every call, and @client.trace() wraps your function to bundle everything into a trace:

from openai import OpenAI
from infinium import InfiniumClient
from infinium.integrations import watch

client = InfiniumClient(agent_id="...", agent_secret="...")
openai = watch(OpenAI())  # Patches this instance to auto-capture LLM calls

@client.trace("Summarize Article")
def summarize(article: str) -> str:
    resp = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Summarize this article in 3 bullet points."},
            {"role": "user", "content": article},
        ],
    )
    return resp.choices[0].message.content

# LLM tokens, latency, and timing are captured automatically -- trace is auto-sent
result = summarize("The Federal Reserve announced today...")

Supported Providers

Provider       Client                                          Detection
OpenAI         openai.OpenAI / openai.AsyncOpenAI              Module name
Anthropic      anthropic.Anthropic / anthropic.AsyncAnthropic  Module name
Google Gemini  google.generativeai.GenerativeModel             Class name
xAI (Grok)     openai.OpenAI(base_url="https://api.x.ai/...")  Base URL

All providers support streaming transparently -- chunk accumulation and token extraction happen automatically.
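Handling a streamed response amounts to concatenating content deltas and reading token usage from whichever chunk carries it (typically the final one). A simplified sketch of what the wrapper does -- illustrative only, not the SDK's actual internals:

```python
# Illustrative sketch of stream handling: accumulate text deltas and
# pick up token usage from whichever chunk carries it (usually the last).
def accumulate_stream(chunks):
    """chunks: iterable of dicts like {"delta": str, "usage": dict | None}."""
    text_parts = []
    usage = None
    for chunk in chunks:
        if chunk.get("delta"):
            text_parts.append(chunk["delta"])
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
    return "".join(text_parts), usage

# Example: three content chunks, usage arrives on the final chunk.
stream = [
    {"delta": "Hello", "usage": None},
    {"delta": ", world", "usage": None},
    {"delta": "!", "usage": {"prompt_tokens": 12, "completion_tokens": 3}},
]
text, usage = accumulate_stream(stream)
```

The watched client does this transparently, so streamed calls report the same token counts as non-streamed ones.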

Content Capture

By default, watch() only captures metadata (model, tokens, latency). To also capture input/output previews (truncated to 500 chars):

openai = watch(OpenAI(), capture_content=True)
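The 500-character limit keeps traces small; the truncation itself is straightforward. A sketch of the behavior (the limit matches the description above; the helper name is hypothetical, not SDK code):

```python
PREVIEW_LIMIT = 500  # assumed default, per the 500-char limit described above

def preview(text: str, limit: int = PREVIEW_LIMIT) -> str:
    """Truncate content to a short preview, marking cut-off text."""
    return text if len(text) <= limit else text[:limit] + "…"
```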

Sending Traces

There are three ways to send traces, from simplest to most detailed:

1. send_task() -- Simple, Inline

Best for single-step agents. Call your LLM, then report what happened:

import time
from openai import OpenAI

openai = OpenAI()
start = time.time()
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": ticket_text}],
)
duration = time.time() - start
usage = response.usage

client.send_task(
    name="Classify support ticket",
    description="Categorized an inbound support ticket.",
    duration=duration,
    llm_usage={
        "model": "gpt-4o", "provider": "openai",
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
    },
    customer={"customer_name": customer_name},
)

2. TaskData -- Structured, Multi-Step

When your agent runs multiple steps, build a TaskData with execution details. This gives Maestro the most signal:

from infinium import InfiniumClient, ExecutionStep, LlmCall, LlmUsage, ExpectedOutcome

steps = []

# Step 1: Search
steps.append(ExecutionStep(
    step_number=1, action="tool_use",
    description="Search knowledge base",
    duration_ms=340,
    output_preview=f"Found {len(results)} results",
))

# Step 2: Analyze with LLM
steps.append(ExecutionStep(
    step_number=2, action="llm_inference",
    description="Analyze search results",
    duration_ms=1200,
    llm_call=LlmCall(model="gpt-4o", provider="openai", prompt_tokens=800, completion_tokens=200),
))

td = InfiniumClient.create_task_data(
    name="Research: quarterly earnings",
    description="Multi-step research with source gathering and analysis",
    duration=3.2,
    steps=steps,
    expected_outcome=ExpectedOutcome(
        task_objective="Produce sourced analysis of quarterly earnings",
        required_deliverables=["Source list", "Analysis"],
    ),
    llm_usage=LlmUsage(model="gpt-4o", provider="openai", prompt_tokens=800, completion_tokens=200, api_calls_count=1),
)
client.send_task_data(td)

3. TraceBuilder -- Recommended for Real Agents

Steps are context managers that auto-capture timing and exceptions:

from infinium import TraceBuilder, LlmUsage

trace = TraceBuilder("Blog Post Generator", "Generate a publish-ready blog post.")
trace.set_expected_outcome(
    task_objective="Produce a 600-word blog post with SEO metadata",
    required_deliverables=["Blog post", "SEO title", "Meta description"],
)
trace.set_input_summary(f"Topic: {topic}")

with trace.step("llm_inference", "Create outline") as step:
    resp = openai.chat.completions.create(model="gpt-4o", messages=[...])
    outline = resp.choices[0].message.content
    step.set_output(outline[:500])
    step.record_llm_call("gpt-4o", provider="openai",
                         prompt_tokens=resp.usage.prompt_tokens,
                         completion_tokens=resp.usage.completion_tokens)

with trace.step("llm_inference", "Write full post") as step:
    resp = openai.chat.completions.create(model="gpt-4o", messages=[...])
    blog_post = resp.choices[0].message.content
    step.set_output(blog_post[:500])
    step.record_llm_call("gpt-4o", provider="openai",
                         prompt_tokens=resp.usage.prompt_tokens,
                         completion_tokens=resp.usage.completion_tokens)

trace.set_output_summary(f"Generated {len(blog_post.split())}-word blog post.")
task_data = trace.build()  # Auto-computes total duration
client.send_task_data(task_data)

@trace_agent Decorator

For single-function agents, the decorator handles everything -- duration, input/output capture, error recording, and sending:

from infinium import trace_agent

@trace_agent("Email Classifier", client)
def classify_email(email_body: str) -> dict:
    resp = openai.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Classify this email. Return JSON with intent, priority, department."},
            {"role": "user", "content": email_body},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

# Every call auto-sends a trace
result = classify_email("Hi, I was charged twice for my subscription...")

If the function raises, the error is captured in the trace, the trace is still sent, and the exception is re-raised. Use @async_trace_agent for async functions, or @client.trace(), which auto-detects both.
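That contract -- measure, capture, send even on failure, then re-raise -- can be sketched generically. This is a simplified stand-in, not the SDK's implementation:

```python
import functools
import time

def trace_like(name, send):
    """Simplified model of @trace_agent: time the call, record success or
    error, always send the trace, and never swallow the exception."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.time()
            record = {"name": name, "error": None}
            try:
                result = fn(*args, **kwargs)
                record["output"] = repr(result)[:500]
                return result
            except Exception as exc:
                record["error"] = {"type": type(exc).__name__, "message": str(exc)}
                raise  # the exception propagates to the caller
            finally:
                record["duration"] = time.time() - start
                send(record)  # the trace is sent on success *and* failure
        return wrapper
    return decorator
```

Because the send happens in a finally block, a crashing agent still produces a trace with the error details attached.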

Prompt Studio

Retrieve managed prompts with optional variable substitution:

prompt = client.get_prompt(
    prompt_id="your-prompt-id",
    prompt_key="your-prompt-key",
    version="latest",
    variables={"customer_name": "Acme Corp", "tone": "professional"},
)

print(prompt.content)           # Raw template
print(prompt.rendered_content)  # With variables substituted

When used inside a @trace_agent or @client.trace() decorated function, prompt fetches are automatically recorded in the trace.

Async Support

Every feature has a sync and async variant:

import asyncio
from infinium import AsyncInfiniumClient

async def main():
    async with AsyncInfiniumClient(agent_id="...", agent_secret="...") as client:
        await client.send_task(
            name="Moderate content",
            description="Check user content for policy violations.",
            duration=1.8,
        )

asyncio.run(main())

Batch Sending

Send multiple traces at once:

tasks = [InfiniumClient.create_task_data(...) for doc in documents]
result = client.send_tasks_batch(tasks)
print(f"Sent {result.successful}/{result.successful + result.failed} traces")

Polling for Maestro Results

After sending a trace, poll until Maestro finishes analysis:

response = client.send_task_data(task_data)
trace_id = response.data.get("traceId")

interpretation = client.wait_for_interpretation(trace_id, timeout=120, poll_interval=3)
print(interpretation.data["interpretedTraceResult"])

Configuration

client = InfiniumClient(
    agent_id="your-agent-id",
    agent_secret="your-agent-secret",
    base_url="https://platform.i42m.ai/api/v1",  # API endpoint (default)
    timeout=30.0,                                  # Request timeout in seconds
    max_retries=3,                                 # Retry attempts (exponential backoff)
    enable_rate_limiting=True,                     # Client-side rate limiting
    requests_per_second=10.0,                      # Rate limit threshold
    verify_ssl=True,                               # Set False for self-signed certs
)

Error Handling

from infinium.exceptions import (
    AuthenticationError, RateLimitError, ValidationError,
    NetworkError, ServerError, InfiniumTimeoutError,
)

try:
    client.send_task_data(task_data)
except AuthenticationError:
    print("Bad credentials -- check agent_id and agent_secret")
except RateLimitError as e:
    print(f"Rate limited -- retry after {e.retry_after}s")
except ValidationError as e:
    print(f"Invalid input: {e.field} -- {e}")
except NetworkError:
    print("Cannot reach the API")
except InfiniumTimeoutError:
    print("Request timed out")

The SDK automatically retries on 5xx errors, network failures, and timeouts (up to max_retries). Auth, validation, and rate limit errors are not retried.
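That policy -- exponential backoff on transient failures, immediate propagation of client errors -- looks roughly like this. A generic sketch with hypothetical names, not the SDK's internals:

```python
import time

class TransientError(Exception):
    """Stand-in for retryable failures: 5xx responses, network errors, timeouts."""

def send_with_retries(send, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Retry a callable on transient errors with exponential backoff.
    Non-transient exceptions (auth, validation, rate limit) propagate at once."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last failure
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```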

Data Types

Trace Enrichment

Type                Purpose
ExecutionStep       One thing the agent did (tool call, LLM call, decision)
ErrorDetail         A factual error record (type, message, recoverable?)
LlmCall             A single LLM invocation (model, tokens, latency)
ToolCall            A tool/API invocation (name, status code, duration)
ExpectedOutcome     What the agent was asked to produce (Maestro's grading rubric)
LlmUsage            Aggregate token/cost stats across the whole trace
EnvironmentContext  Runtime info (framework, Python version, SDK version)

Domain Sections

Attach context relevant to the agent's domain via keyword arguments or set_section():

Section      Use Case
Customer     Customer name, email, company
Sales        Lead source, deal value, pipeline stage
Support      Ticket ID, issue type, resolution
Research     Topic, sources consulted, key findings
Development  Language, framework, test coverage
Marketing    Campaign, channel, target audience
Content      Content type, word count, SEO data
Executive    Meeting agenda, action items, decisions
Project      Deliverables, stakeholders, blockers
General      Tools used, tags, notes
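A section is a plain dictionary of facts, passed via the matching keyword (as with customer= in the send_task example above). A support agent might assemble its section like this -- the field names and helper are illustrative, not a fixed schema:

```python
def build_support_section(ticket_id, issue_type, resolution=None):
    """Hypothetical helper: collect support-domain facts, dropping unset fields."""
    section = {
        "ticket_id": ticket_id,
        "issue_type": issue_type,
        "resolution": resolution,
    }
    return {k: v for k, v in section.items() if v is not None}

support = build_support_section("TCK-1042", "billing")
# Attach via keyword, e.g. client.send_task(..., support=support)
```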

Design Principle

The agent reports facts. Maestro evaluates. The SDK never asks the agent to score itself, say pass/fail, or rate its own confidence. It captures what happened -- timestamps, token counts, error types, what was produced -- and Maestro decides if the work was good.

Documentation

For in-depth guides, API reference, and advanced usage, see the full documentation.

Development

git clone <repository-url>
cd infinium-python-sdk
pip install -e ".[dev]"
pytest

Support

  • Create an issue in the project repository
  • Contact support at support@i42m.ai
