Python SDK for Inference Provider API - Build powerful AI agents with RAG, tool calling, and MCP integration
Project description
Inference Provider SDK (Python)
Python SDK for Inference Provider V2 API. Build powerful AI agents with RAG, tool calling, and MCP integration.
Installation
pip install inference-provider-sdk
Or with poetry:
poetry add inference-provider-sdk
Quick Start
from inference_provider import InferenceProviderClient
# Initialize client
client = InferenceProviderClient(
api_key="ip_xxxxxxxxxx",
api_secret="xxxxxxxxxxxxxx"
)
# Run agent inference
response = client.agents.run(
agent_id="your-agent-id",
user_message="Hello, world!"
)
print(response.response)
print(f"Cost: ${response.usage.cost}")
print(f"Tokens: {response.usage.total_tokens}")
Features
- ✅ Full Python type hints with Pydantic
- ✅ Both sync and async support
- ✅ Agent inference with conversation history
- ✅ RAG (Retrieval-Augmented Generation)
- ✅ Tool calling (REST API, JavaScript, MCP)
- ✅ MCP server integration
- ✅ Provider and model management
- ✅ Custom response formatting
- ✅ Automatic retry with exponential backoff
- ✅ Rate limit handling
- ✅ Context manager support
Authentication
Environment Variables
export INFERENCE_API_KEY=ip_xxxxxxxxxx
export INFERENCE_API_SECRET=xxxxxxxxxxxxxx
# Auto-loads from environment
client = InferenceProviderClient()
Explicit Configuration
client = InferenceProviderClient(
api_key="ip_xxxxxxxxxx",
api_secret="xxxxxxxxxxxxxx",
base_url="https://your-instance.supabase.co", # Optional
timeout=60, # Optional, default 60s
max_retries=3, # Optional, default 3
debug=False # Optional, default False
)
Usage Examples
Basic Agent Inference
from inference_provider import InferenceProviderClient
client = InferenceProviderClient()
response = client.agents.run(
agent_id="agent-id",
user_message="What is the weather today?"
)
print(response.response)
With Conversation History
from inference_provider.types import ConversationMessage
history = [
ConversationMessage(role="user", content="My name is Alice"),
ConversationMessage(role="assistant", content="Nice to meet you, Alice!")
]
response = client.agents.chat(
agent_id="agent-id",
message="What is my name?",
history=history
)
print(response.response)
With RAG
response = client.agents.run_with_rag(
agent_id="agent-id",
message="Tell me about our product features",
collection_id="collection-id",
match_threshold=0.8,
match_count=5
)
if response.rag:
print(f"Found {response.rag.results_count} relevant documents")
for result in response.rag.results:
print(f"Similarity: {result.similarity}")
print(f"Content: {result.content}")
With Vision (Image Inputs)
from inference_provider.types import ImageInput
response = client.agents.run_with_vision(
agent_id="agent-id",
message="What do you see in this image?",
images=[
ImageInput(
type="image_url",
image_url={"url": "data:image/jpeg;base64,/9j/4AAQSkZJRg..."}
)
]
)
print(response.response)
With Variable Substitution
response = client.agents.run(
agent_id="agent-id",
user_message="Process this request",
variables={
"user_name": "Alice",
"company_name": "Acme Corp"
}
)
Agent Management
Create Agent
from inference_provider.types import Variable, VariableType
agent = client.agents.create(
name="Customer Support Agent",
description="Handles customer inquiries",
system_prompt="You are a helpful customer support agent for {{company_name}}",
model_name="gpt-4",
temperature=0.7,
max_tokens=2000,
variables=[
Variable(
name="company_name",
type=VariableType.TEXT,
description="Company name",
default_value="Acme Corp"
)
],
tags=["customer-support", "production"]
)
print(f"Created agent: {agent.id}")
List Agents
agents = client.agents.list()
active_agents = client.agents.list(is_active=True)
for agent in agents:
print(f"{agent.name} ({agent.id})")
Update Agent
updated = client.agents.update(
agent_id="agent-id",
temperature=0.8,
system_prompt="Updated prompt"
)
Delete Agent
client.agents.delete("agent-id")
Async Support
import asyncio
from inference_provider import AsyncInferenceProviderClient
async def main():
async with AsyncInferenceProviderClient() as client:
response = await client.agents.run(
agent_id="agent-id",
user_message="Hello, world!"
)
print(response.response)
asyncio.run(main())
Error Handling
from inference_provider import (
InferenceProviderClient,
AuthenticationError,
ValidationError,
NotFoundError,
RateLimitError,
APIError,
NetworkError
)
try:
response = client.agents.run(
agent_id="agent-id",
user_message="Hello"
)
except AuthenticationError as e:
print(f"Invalid credentials: {e}")
except ValidationError as e:
print(f"Invalid input: {e.message} (field: {e.field})")
except NotFoundError as e:
print(f"Resource not found: {e}")
except RateLimitError as e:
print(f"Rate limit exceeded: {e.message}")
print(f"Reset time: {e.reset_time}")
except APIError as e:
print(f"API error: {e.status_code} - {e.message}")
except NetworkError as e:
print(f"Network error: {e}")
Context Manager
# Sync
with InferenceProviderClient() as client:
response = client.agents.run(
agent_id="agent-id",
user_message="Hello"
)
# Async
async with AsyncInferenceProviderClient() as client:
response = await client.agents.run(
agent_id="agent-id",
user_message="Hello"
)
Utility Functions
Text Chunking for RAG
from inference_provider.utils import chunk_text
text = "Long document text..."
chunks = chunk_text(text, chunk_size=500, chunk_overlap=50)
for i, chunk in enumerate(chunks):
print(f"Chunk {i + 1}: {chunk[:100]}...")
Variable Substitution
from inference_provider.utils import substitute_variables, extract_variable_names
template = "Hello {{name}}, welcome to {{company}}!"
variables = {"name": "Alice", "company": "Acme Corp"}
result = substitute_variables(template, variables)
# => "Hello Alice, welcome to Acme Corp!"
names = extract_variable_names(template)
# => ["name", "company"]
Type Safety
The SDK provides comprehensive type hints:
from inference_provider.types import (
Agent,
AgentInferenceResponse,
AIProvider,
AIModel,
ToolDefinition,
DocumentCollection,
MCPServer
)
# Full type safety with IDE auto-completion
response: AgentInferenceResponse = client.agents.run(
agent_id="agent-id",
user_message="Hello"
)
print(response.usage.total_tokens)
print(response.agent.name)
Development
# Install dependencies
poetry install
# Run tests
pytest
# Run tests with coverage
pytest --cov=inference_provider
# Format code
black inference_provider tests
# Lint
ruff inference_provider tests
# Type check
mypy inference_provider
License
MIT
Support
- Documentation: https://docs.inference-provider.com
- Issues: https://github.com/your-org/inference-provider/issues
- Email: support@inference-provider.com
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file inference_provider_sdk-1.0.0.tar.gz.
File metadata
- Download URL: inference_provider_sdk-1.0.0.tar.gz
- Upload date:
- Size: 21.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
47c8e229a66fd328b71a70501f72b354c3ad5aa1e8456408ddc3cfbbf00552bc
|
|
| MD5 |
647778f3a55d5616b231cb5263074526
|
|
| BLAKE2b-256 |
1e142b25b4089a7ad6d18549d96f8c7391310bba05b8966fdddae65cfb541e66
|
File details
Details for the file inference_provider_sdk-1.0.0-py3-none-any.whl.
File metadata
- Download URL: inference_provider_sdk-1.0.0-py3-none-any.whl
- Upload date:
- Size: 26.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
48becd3eaaa1a5ca752bf783c9fe3b1942b8bdc8f044beda9b696be05c28a41a
|
|
| MD5 |
d1c69aeb5981abd302d9434c24a9dc67
|
|
| BLAKE2b-256 |
17dd3110b75f1ee51878eb482560869c17c6c48c4332d13cf15bac015be32244
|