Skip to main content

Infrastructure for efficient and scalable AI applications.

Project description

ai-infra

Build AI applications in minutes, not months.

PyPI CI Python License

Overview

One unified SDK for LLMs, agents, RAG, voice, images, and MCP—across 10+ providers.

Key Features

  • LLM Chat - Chat, streaming, structured output, retries across providers
  • Agents - Tool calling, human-in-the-loop, deep research mode
  • RAG - Embeddings, vector stores, retrieval pipelines
  • MCP - Client/server, OpenAPI->MCP conversion, tool discovery
  • Voice - Text-to-speech, speech-to-text, realtime conversations
  • Tracing - OpenTelemetry distributed tracing built-in

Why ai-infra?

Building AI apps means juggling OpenAI, Anthropic, Google, embeddings, vector stores, tool calling, MCP servers... each with different APIs and gotchas.

ai-infra gives you one clean interface that works everywhere:

from ai_infra import Agent

def search_web(query: str) -> str:
    """Search the web."""
    return f"Results for: {query}"

agent = Agent(tools=[search_web])
result = agent.run("Find the latest news about AI")
# Works with OpenAI, Anthropic, Google—same code.

Quick Install

pip install ai-infra

What's Included

Feature What You Get One-liner
LLM Chat Chat, streaming, structured output, retries LLM().chat("Hello")
Agents Tool calling, human-in-the-loop, deep mode Agent(tools=[...]).run(...)
RAG Embeddings, vector stores, retrieval Retriever().search(...)
MCP Client/server, OpenAPI->MCP, tool discovery MCPClient(url)
Voice Text-to-speech, speech-to-text, realtime TTS().speak(...)
Images DALL-E, Stability, Imagen generation ImageGen().generate(...)
Graph LangGraph workflows, typed state Graph().add_node(...)
Memory Context fitting, rolling summaries fit_context(messages, max_tokens=4000)
Workspace Sandboxed file operations for agents Workspace("./project")
Validation Prompt injection, PII detection validate_prompt(input)
Tracing OpenTelemetry distributed tracing configure_tracing(...)

30-Second Examples

Chat with any LLM

from ai_infra import LLM

llm = LLM()  # Uses OPENAI_API_KEY by default
response = llm.chat("Explain quantum computing in one sentence")
print(response)

# Switch providers instantly
llm = LLM(provider="anthropic", model="claude-sonnet-4-20250514")
response = llm.chat("Same question, different model")

Build an Agent with Tools

from ai_infra import Agent

def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"72F and sunny in {city}"

def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Top results for: {query}"

agent = Agent(tools=[get_weather, search_web])
result = agent.run("What's the weather in Tokyo and find me restaurants there")
# Agent automatically calls both tools and synthesizes the answer

RAG in 5 Lines

from ai_infra import Retriever

retriever = Retriever()
retriever.add_file("company_docs.pdf")
retriever.add_file("product_manual.md")

results = retriever.search("How do I reset my password?")
print(results[0].content)

Connect to MCP Servers

from ai_infra import MCPClient

async with MCPClient("http://localhost:8080") as client:
    tools = await client.list_tools()
    result = await client.call_tool("search", {"query": "AI news"})

Create an MCP Server

from ai_infra import mcp_from_functions

def search_docs(query: str) -> str:
    """Search documentation."""
    return f"Found: {query}"

mcp = mcp_from_functions(name="my-mcp", functions=[search_docs])
mcp.run(transport="stdio")

Supported Providers

Provider Chat Embeddings TTS STT Images Realtime
OpenAI Yes Yes Yes Yes Yes Yes
Anthropic Yes - - - - -
Google Yes Yes Yes Yes Yes Yes
xAI (Grok) Yes - - - - -
ElevenLabs - - Yes - - -
Deepgram - - - Yes - -
Stability AI - - - - Yes -
Replicate - - - - Yes -
Voyage AI - Yes - - - -
Cohere - Yes - - - -

Setup

# Set your API keys (use whichever providers you need)
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GOOGLE_API_KEY=...

# That's it. ai-infra auto-detects available providers.

Feature Highlights

Deep Agent (Autonomous Mode)

For complex, multi-step tasks:

from ai_infra import DeepAgent

agent = DeepAgent(
    goal="Analyze this codebase and generate documentation",
    tools=[read_file, write_file, search],
    max_iterations=50,
)

result = await agent.run()
print(result.output)

Includes: Planning, self-correction, progress tracking, human approval gates.

MCP Client with Interceptors

Advanced MCP features:

from ai_infra import MCPClient
from ai_infra.mcp import RetryInterceptor, CachingInterceptor, LoggingInterceptor

async with MCPClient(
    "http://localhost:8080",
    interceptors=[
        RetryInterceptor(max_retries=3),
        CachingInterceptor(ttl=300),
        LoggingInterceptor(),
    ]
) as client:
    # Automatic retries, caching, and logging for all tool calls
    result = await client.call_tool("expensive_operation", {...})

Includes: Callbacks, interceptors, prompts, resources, progress tracking.

RAG with Multiple Backends

from ai_infra import Retriever

# In-memory (development)
retriever = Retriever(backend="memory")

# SQLite (local persistence)
retriever = Retriever(backend="sqlite", path="./vectors.db")

# PostgreSQL with pgvector (production)
retriever = Retriever(backend="postgres", connection_string="...")

# Pinecone (managed cloud)
retriever = Retriever(backend="pinecone", index_name="my-index")

Voice & Multimodal

from ai_infra import TTS, STT

# Text to speech
tts = TTS(provider="elevenlabs")
audio = tts.speak("Hello, world!")

# Speech to text
stt = STT(provider="deepgram")
text = stt.transcribe("audio.mp3")

Image Generation

from ai_infra import ImageGen

gen = ImageGen(provider="openai")  # or "stability", "replicate"
image = gen.generate("A futuristic city at sunset")
image.save("city.png")

CLI Tools

# Test MCP connections
ai-infra mcp test --url http://localhost:8080

# List MCP tools
ai-infra mcp tools --url http://localhost:8080

# Call an MCP tool
ai-infra mcp call --url http://localhost:8080 --tool search --args '{"query": "test"}'

# Server info
ai-infra mcp info --url http://localhost:8080

Documentation

Section Description
Getting Started Installation, API keys, first example
Core
LLM Chat, streaming, structured output
Agent Tool calling, human-in-the-loop
Graph LangGraph workflows
RAG & Embeddings
Retriever Vector search, file loading
Embeddings Text embeddings
MCP
Client Connect to MCP servers
Server Create MCP servers
Multimodal
TTS Text-to-speech
STT Speech-to-text
Vision Image understanding
Advanced
Deep Agent Autonomous agents
Personas Agent personalities
Workspace Sandboxed file operations
Memory Context management, rolling summaries
Streaming Typed streaming events
Infrastructure
Validation Prompt/response validation
Tracing OpenTelemetry tracing
Callbacks Execution hooks
CLI Reference Command-line tools

Running Examples

git clone https://github.com/nfraxlab/ai-infra.git
cd ai-infra
poetry install

# Chat
poetry run python -c "from ai_infra import LLM; print(LLM().chat('Hello!'))"

# Agent
poetry run python examples/agents/01_basic_tools.py

# See more examples
ls examples/

Related Packages

ai-infra is part of the nfrax infrastructure suite:

Package Purpose
ai-infra AI/LLM infrastructure (agents, tools, RAG, MCP)
svc-infra Backend infrastructure (auth, billing, jobs, webhooks)
fin-infra Financial infrastructure (banking, portfolio, insights)

License

MIT License - use it for anything.


Project details


Release history Release notifications | RSS feed

This version

1.4.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ai_infra-1.4.0.tar.gz (355.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ai_infra-1.4.0-py3-none-any.whl (450.9 kB view details)

Uploaded Python 3

File details

Details for the file ai_infra-1.4.0.tar.gz.

File metadata

  • Download URL: ai_infra-1.4.0.tar.gz
  • Upload date:
  • Size: 355.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ai_infra-1.4.0.tar.gz
Algorithm Hash digest
SHA256 fbb88bccc7969dddc1c5d28cd50de6057d776ddff922386a63d4b502eff89ceb
MD5 d94fc62da2cc22324cacf58790349c29
BLAKE2b-256 f15a9b600678f46e875225b6ec770ad79b0b0f31a4c95f6b093509dacdd02fd1

See more details on using hashes here.

File details

Details for the file ai_infra-1.4.0-py3-none-any.whl.

File metadata

  • Download URL: ai_infra-1.4.0-py3-none-any.whl
  • Upload date:
  • Size: 450.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ai_infra-1.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 faab4a4232e34682954b68d893d52b64a651e9ebe81ff4e9e0e9388473d99f93
MD5 3efcf2369bf36a2b76957df6a3d131f8
BLAKE2b-256 45f78255350e5d88ed298783b08e8af9e5f5b9773d87f01309d7ec38f6463aac

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page