Python SDK for Inference Provider API - Build powerful AI agents with RAG, tool calling, and MCP integration

Inference Provider SDK (Python)

Python SDK for Inference Provider V2 API. Build powerful AI agents with RAG, tool calling, and MCP integration.

Installation

pip install inference-provider-sdk

Or with Poetry:

poetry add inference-provider-sdk

Quick Start

from inference_provider import InferenceProviderClient

# Initialize client
client = InferenceProviderClient(
    api_key="ip_xxxxxxxxxx",
    api_secret="xxxxxxxxxxxxxx"
)

# Run agent inference
response = client.agents.run(
    agent_id="your-agent-id",
    user_message="Hello, world!"
)

print(response.response)
print(f"Cost: ${response.usage.cost}")
print(f"Tokens: {response.usage.total_tokens}")

Features

✅ Full Python type hints with Pydantic
✅ Both sync and async support
✅ Agent inference with conversation history
✅ RAG (Retrieval-Augmented Generation)
✅ Tool calling (REST API, JavaScript, MCP)
✅ MCP server integration
✅ Provider and model management
✅ Custom response formatting
✅ Automatic retry with exponential backoff
✅ Rate limit handling
✅ Context manager support
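
The feature list mentions automatic retry with exponential backoff. The SDK's internal retry logic is not shown in this README, so the following is only a minimal sketch of the general technique; `backoff_delays` and `call_with_retry` are hypothetical helper names, and the base delay, cap, and jitter strategy are assumptions rather than the SDK's actual settings.

```python
import random
import time


def backoff_delays(max_retries=3, base=0.5, cap=8.0):
    """Return the un-jittered delay in seconds before each retry.

    The delay doubles on every attempt (base * 2**attempt) and is capped.
    """
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_retry(func, max_retries=3, base=0.5, cap=8.0, sleep=time.sleep):
    """Call func(); on failure, wait with exponential backoff and try again.

    Hypothetical helper for illustration only, not part of the SDK.
    A real client would normally retry only transient errors
    (timeouts, HTTP 429/5xx), not every exception.
    """
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # Full jitter: sleep a random amount up to the computed delay
            sleep(random.uniform(0, delays[attempt]))
```

With `max_retries=3` the call is attempted at most four times. The jitter spreads retries from many clients over time so they do not all hit the API at the same instant.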

Authentication

Environment Variables

export INFERENCE_API_KEY=ip_xxxxxxxxxx
export INFERENCE_API_SECRET=xxxxxxxxxxxxxx

# In Python, the client then reads both variables from the environment
client = InferenceProviderClient()
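
To make the lookup order concrete, here is a minimal sketch of how credentials could be resolved. It assumes that explicit arguments take precedence over environment variables, which this README does not confirm; `load_credentials` is a hypothetical helper, not an SDK function.

```python
import os


def load_credentials(api_key=None, api_secret=None, env=None):
    """Resolve credentials: explicit arguments first, then environment variables.

    Hypothetical helper; the precedence shown here is an assumption.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("INFERENCE_API_KEY")
    secret = api_secret or env.get("INFERENCE_API_SECRET")

    # Report every missing variable at once instead of failing on the first
    required = (("INFERENCE_API_KEY", key), ("INFERENCE_API_SECRET", secret))
    missing = [name for name, value in required if not value]
    if missing:
        raise ValueError("Missing credentials: " + ", ".join(missing))
    return key, secret
```

Failing early with a clear message is usually friendlier than letting the first API call return an authentication error.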

Explicit Configuration

client = InferenceProviderClient(
    api_key="ip_xxxxxxxxxx",
    api_secret="xxxxxxxxxxxxxx",
    base_url="https://your-instance.supabase.co",  # Optional
    timeout=60,  # Optional, default 60s
    max_retries=3,  # Optional, default 3
    debug=False  # Optional, default False
)

Usage Examples

Basic Agent Inference

from inference_provider import InferenceProviderClient

client = InferenceProviderClient()

response = client.agents.run(
    agent_id="agent-id",
    user_message="What is the weather today?"
)

print(response.response)

With Conversation History

from inference_provider.types import ConversationMessage

history = [
    ConversationMessage(role="user", content="My name is Alice"),
    ConversationMessage(role="assistant", content="Nice to meet you, Alice!")
]

response = client.agents.chat(
    agent_id="agent-id",
    message="What is my name?",
    history=history
)

print(response.response)

With RAG

response = client.agents.run_with_rag(
    agent_id="agent-id",
    message="Tell me about our product features",
    collection_id="collection-id",
    match_threshold=0.8,
    match_count=5
)

if response.rag:
    print(f"Found {response.rag.results_count} relevant documents")
    for result in response.rag.results:
        print(f"Similarity: {result.similarity}")
        print(f"Content: {result.content}")
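
The `match_threshold` and `match_count` parameters are easiest to understand with a local sketch. The example below assumes cosine similarity over embedding vectors, which is a common choice but is not confirmed by this README; `cosine_similarity` and `top_matches` are hypothetical names and the vectors are toy data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, documents, match_threshold=0.8, match_count=5):
    """Keep documents scoring at least match_threshold, best first, at most match_count.

    documents is a list of (text, vector) pairs.
    Hypothetical illustration; the service's real scoring may differ.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in documents]
    kept = [pair for pair in scored if pair[0] >= match_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:match_count]
```

Under these assumptions, raising `match_threshold` trades recall for precision, while `match_count` only limits how many of the surviving results are returned.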

With Vision (Image Inputs)

from inference_provider.types import ImageInput

response = client.agents.run_with_vision(
    agent_id="agent-id",
    message="What do you see in this image?",
    images=[
        ImageInput(
            type="image_url",
            image_url={"url": "data:image/jpeg;base64,/9j/4AAQSkZJRg..."}
        )
    ]
)

print(response.response)
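
The example above passes the image as a base64 data URL. A small sketch of how such a URL could be built from raw bytes using only the standard library follows; `to_data_url` is a hypothetical helper, not part of the SDK.

```python
import base64
import mimetypes


def to_data_url(image_bytes, filename="image.jpg"):
    """Encode raw image bytes as a data URL, guessing the MIME type from the file name.

    Hypothetical helper for illustration only.
    """
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognised image file name: {filename}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

The resulting string can be placed in the `url` field of the `image_url` dictionary shown above. Base64 inflates the payload by roughly a third, so large images may be better served from a regular URL if the API accepts one.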

With Variable Substitution

response = client.agents.run(
    agent_id="agent-id",
    user_message="Process this request",
    variables={
        "user_name": "Alice",
        "company_name": "Acme Corp"
    }
)

Agent Management

Create Agent

from inference_provider.types import Variable, VariableType

agent = client.agents.create(
    name="Customer Support Agent",
    description="Handles customer inquiries",
    system_prompt="You are a helpful customer support agent for {{company_name}}",
    model_name="gpt-4",
    temperature=0.7,
    max_tokens=2000,
    variables=[
        Variable(
            name="company_name",
            type=VariableType.TEXT,
            description="Company name",
            default_value="Acme Corp"
        )
    ],
    tags=["customer-support", "production"]
)

print(f"Created agent: {agent.id}")

List Agents

agents = client.agents.list()
active_agents = client.agents.list(is_active=True)

for agent in agents:
    print(f"{agent.name} ({agent.id})")

Update Agent

updated = client.agents.update(
    agent_id="agent-id",
    temperature=0.8,
    system_prompt="Updated prompt"
)

Delete Agent

client.agents.delete("agent-id")

Async Support

import asyncio
from inference_provider import AsyncInferenceProviderClient

async def main():
    async with AsyncInferenceProviderClient() as client:
        response = await client.agents.run(
            agent_id="agent-id",
            user_message="Hello, world!"
        )
        print(response.response)

asyncio.run(main())
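
The async client pays off when several requests run concurrently. The sketch below shows one way to cap concurrency with `asyncio.Semaphore` and `asyncio.gather`. To stay self-contained it uses a stand-in coroutine, `fake_agent_run`, instead of the real client; `run_all` is also a hypothetical helper.

```python
import asyncio


async def run_all(prompts, worker, max_concurrency=3):
    """Run worker(prompt) for every prompt, at most max_concurrency at a time.

    Results come back in the same order as prompts.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(prompt):
        async with semaphore:  # wait here while the limit is reached
            return await worker(prompt)

    return await asyncio.gather(*(guarded(p) for p in prompts))


async def fake_agent_run(prompt):
    """Stand-in for an awaitable API call (hypothetical, for illustration)."""
    await asyncio.sleep(0.01)
    return f"echo: {prompt}"


if __name__ == "__main__":
    print(asyncio.run(run_all(["a", "b", "c"], fake_agent_run, max_concurrency=2)))
```

In real use, `worker` would wrap a call such as `client.agents.run(...)` inside the `async with AsyncInferenceProviderClient()` block. Limiting concurrency helps stay under rate limits.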

Error Handling

from inference_provider import (
    InferenceProviderClient,
    AuthenticationError,
    ValidationError,
    NotFoundError,
    RateLimitError,
    APIError,
    NetworkError
)

# Create the client (credentials are read from the environment)
client = InferenceProviderClient()

try:
    response = client.agents.run(
        agent_id="agent-id",
        user_message="Hello"
    )
except AuthenticationError as e:
    print(f"Invalid credentials: {e}")
except ValidationError as e:
    print(f"Invalid input: {e.message} (field: {e.field})")
except NotFoundError as e:
    print(f"Resource not found: {e}")
except RateLimitError as e:
    print(f"Rate limit exceeded: {e.message}")
    print(f"Reset time: {e.reset_time}")
except APIError as e:
    print(f"API error: {e.status_code} - {e.message}")
except NetworkError as e:
    print(f"Network error: {e}")
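
When a `RateLimitError` is raised, a caller typically waits until the limit resets before retrying. This README does not document the type of `e.reset_time`, so the sketch below assumes it is either a Unix timestamp or a `datetime`; `seconds_until` is a hypothetical helper.

```python
from datetime import datetime, timezone


def seconds_until(reset_time, now=None):
    """Return how many seconds remain until reset_time (never negative).

    Assumption: reset_time is a Unix timestamp (int/float) or a datetime.
    Naive datetimes are treated as UTC.
    """
    now = now or datetime.now(timezone.utc)
    if isinstance(reset_time, (int, float)):
        reset_time = datetime.fromtimestamp(reset_time, tz=timezone.utc)
    if reset_time.tzinfo is None:
        reset_time = reset_time.replace(tzinfo=timezone.utc)
    return max(0.0, (reset_time - now).total_seconds())
```

Inside the `except RateLimitError` branch, one could then call `time.sleep(seconds_until(e.reset_time))` before retrying, provided the assumption about the attribute's type holds.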

Context Manager

# Sync
with InferenceProviderClient() as client:
    response = client.agents.run(
        agent_id="agent-id",
        user_message="Hello"
    )

# Async (must run inside an async function)
async with AsyncInferenceProviderClient() as client:
    response = await client.agents.run(
        agent_id="agent-id",
        user_message="Hello"
    )

Utility Functions

Text Chunking for RAG

from inference_provider.utils import chunk_text

text = "Long document text..."
chunks = chunk_text(text, chunk_size=500, chunk_overlap=50)

for i, chunk in enumerate(chunks):
    print(f"Chunk {i + 1}: {chunk[:100]}...")
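
To illustrate what `chunk_size` and `chunk_overlap` mean, here is a minimal character-based sketch of overlapping chunking. It is not the SDK's implementation, which may split on words, sentences, or tokens instead; `chunk_text_sketch` is a hypothetical name.

```python
def chunk_text_sketch(text, chunk_size=500, chunk_overlap=50):
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share chunk_overlap characters, so content cut at a
    boundary still appears whole in one of the chunks.
    Hypothetical illustration, not the SDK's chunk_text.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks
```

For example, a 10-character string with `chunk_size=4` and `chunk_overlap=1` yields three chunks, each starting on the last character of the previous one.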

Variable Substitution

from inference_provider.utils import substitute_variables, extract_variable_names

template = "Hello {{name}}, welcome to {{company}}!"
variables = {"name": "Alice", "company": "Acme Corp"}

result = substitute_variables(template, variables)
# => "Hello Alice, welcome to Acme Corp!"

names = extract_variable_names(template)
# => ["name", "company"]
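
For readers curious how `{{name}}` placeholders can be handled, the following is a minimal regex-based sketch. It reproduces the outputs shown above but is not the SDK's implementation; the function names are hypothetical, and leaving unknown placeholders untouched is an assumption.

```python
import re

# Matches {{name}} and tolerates spaces inside the braces, e.g. {{ name }}
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def substitute_sketch(template, variables):
    """Replace each {{name}} with variables[name]; unknown names are left as-is."""
    def replace(match):
        name = match.group(1)
        return str(variables[name]) if name in variables else match.group(0)

    return _PLACEHOLDER.sub(replace, template)


def extract_names_sketch(template):
    """Return placeholder names in order of first appearance, without duplicates."""
    names = []
    for name in _PLACEHOLDER.findall(template):
        if name not in names:
            names.append(name)
    return names
```

Keeping unknown placeholders visible in the output makes a missing variable easy to spot. A stricter variant could raise an error instead.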

Type Safety

The SDK provides comprehensive type hints:

from inference_provider.types import (
    Agent,
    AgentInferenceResponse,
    AIProvider,
    AIModel,
    ToolDefinition,
    DocumentCollection,
    MCPServer
)

# Full type safety with IDE auto-completion
response: AgentInferenceResponse = client.agents.run(
    agent_id="agent-id",
    user_message="Hello"
)

print(response.usage.total_tokens)
print(response.agent.name)

Development

# Install dependencies
poetry install

# Run tests
pytest

# Run tests with coverage
pytest --cov=inference_provider

# Format code
black inference_provider tests

# Lint
ruff check inference_provider tests

# Type check
mypy inference_provider

License

MIT

Support

Documentation: https://docs.inference-provider.com
Issues: https://github.com/your-org/inference-provider/issues
Email: support@inference-provider.com

This version

1.0.0

Nov 8, 2025

Download files

Source Distribution

inference_provider_sdk-1.0.0.tar.gz (21.2 kB view details)

Uploaded Nov 8, 2025 Source

Built Distribution

inference_provider_sdk-1.0.0-py3-none-any.whl (26.5 kB view details)

Uploaded Nov 8, 2025 Python 3

File details

Details for the file inference_provider_sdk-1.0.0.tar.gz.

File metadata

Download URL: inference_provider_sdk-1.0.0.tar.gz
Upload date: Nov 8, 2025
Size: 21.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for inference_provider_sdk-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`47c8e229a66fd328b71a70501f72b354c3ad5aa1e8456408ddc3cfbbf00552bc`
MD5	`647778f3a55d5616b231cb5263074526`
BLAKE2b-256	`1e142b25b4089a7ad6d18549d96f8c7391310bba05b8966fdddae65cfb541e66`
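
A downloaded file can be checked against the SHA256 digest in the table above using only the standard library. This is a generic sketch; `sha256_of_file` is a hypothetical helper name.

```python
import hashlib


def sha256_of_file(path, block_size=65536):
    """Return the hex SHA256 digest of a file, reading it in blocks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Comparing `sha256_of_file("inference_provider_sdk-1.0.0.tar.gz")` with the listed SHA256 value confirms the download was not corrupted or altered in transit.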

File details

Details for the file inference_provider_sdk-1.0.0-py3-none-any.whl.

File metadata

Download URL: inference_provider_sdk-1.0.0-py3-none-any.whl
Upload date: Nov 8, 2025
Size: 26.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for inference_provider_sdk-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`48becd3eaaa1a5ca752bf783c9fe3b1942b8bdc8f044beda9b696be05c28a41a`
MD5	`d1c69aeb5981abd302d9434c24a9dc67`
BLAKE2b-256	`17dd3110b75f1ee51878eb482560869c17c6c48c4332d13cf15bac015be32244`
