
StratifyAI — Unified Multi‑Provider LLM Interface


Status: Phase 7.9 Complete
Providers: 9 Operational
Features: Routing • RAG • Caching • Streaming • CLI • Web UI • Vision • Smart Chunking

StratifyAI is a production‑ready Python framework that provides a unified interface for 9+ LLM providers, including OpenAI, Anthropic, Google, DeepSeek, Groq, Grok, OpenRouter, Ollama, and AWS Bedrock. It eliminates vendor lock‑in, simplifies multi‑model development, and enables intelligent routing, cost tracking, caching, streaming, and RAG workflows.


Features

Core

  • Unified API for 9+ LLM providers
  • Async-first architecture with sync wrappers
  • Automatic provider detection
  • Cost tracking and budget enforcement
  • Latency tracking on all responses
  • Retry logic with fallback models
  • Streaming support for all providers
  • Response caching + provider prompt caching
  • Intelligent routing (cost, quality, latency, hybrid)
  • Capability filtering (vision, tools, reasoning)
  • Model metadata and context window awareness
  • Builder pattern for fluent configuration
  • Vision support for image analysis (GPT-4o, Claude, Gemini, Nova)

Advanced

  • Large‑file handling with smart chunking and progressive summarization
  • File extraction (CSV schema, JSON schema, logs, code structure)
  • Auto model selection for extraction tasks
  • RAG pipeline with embeddings + vector DB (ChromaDB)
  • Semantic search and citation tracking
  • Rich/Typer CLI with interactive mode
  • Web UI with markdown rendering and syntax highlighting
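Smart chunking with progressive summarization typically works as sketched below. The function names and parameters here are illustrative assumptions, not StratifyAI's API: chunks overlap so context is not lost at boundaries, each chunk is summarized, and the summaries are then summarized again.

```python
def chunk_text(text: str, size: int, overlap: int) -> list[str]:
    """Split text into fixed-size character windows with overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def progressive_summarize(text: str, summarize, size: int = 1000, overlap: int = 100) -> str:
    """Summarize each chunk, then summarize the concatenated summaries."""
    partials = [summarize(c) for c in chunk_text(text, size, overlap)]
    combined = " ".join(partials)
    # A single chunk needs no second pass
    return combined if len(partials) == 1 else summarize(combined)

chunks = chunk_text("The quick brown fox jumps over the lazy dog", size=20, overlap=5)
# 3 overlapping chunks; `summarize` would be an LLM call in practice
```

In a real pipeline, `summarize` would be an LLM call sized to the model's context window; a production chunker would also split on token counts and natural boundaries rather than raw characters.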

Installation

git clone https://github.com/Bytes0211/stratifyai.git
cd stratifyai
pip install -e .

Or using uv:

uv sync

Configuration

cp .env.example .env
# Add your API keys
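A minimal `.env` might look like the sketch below. The variable names here follow each provider SDK's common convention and are assumptions — check the repository's `.env.example` for the exact names StratifyAI expects.

```shell
# Illustrative only — see .env.example for the authoritative variable names
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...
```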

Check configured providers:

stratifyai check-keys

Quick Start

CLI Usage

stratifyai chat -p openai -m gpt-4o-mini -t "Hello"
stratifyai route "Explain relativity" --strategy hybrid
stratifyai interactive
stratifyai cache-stats

Python Example (LLMClient)

from stratifyai import LLMClient
from stratifyai.models import Message, ChatRequest, ChatResponse

client: LLMClient = LLMClient()
request: ChatRequest = ChatRequest(
    model="gpt-4o-mini",
    messages=[Message(role="user", content="Explain quantum computing")]
)

# Async (recommended) — top-level await works in a REPL/notebook;
# in a script, call this inside an async function run via asyncio.run()
response: ChatResponse = await client.chat_completion(request)

# Sync wrapper for scripts/CLI
response: ChatResponse = client.chat_completion_sync(request)

print(response.content)
print(f"Cost: ${response.usage.cost_usd:.6f}")
print(f"Latency: {response.latency_ms:.0f}ms")

Python Example (Chat Package - Simplified)

from stratifyai.chat import anthropic, openai
from stratifyai.models import ChatResponse

# Quick usage - model is always required
response: ChatResponse = await anthropic.chat("Hello!", model="claude-sonnet-4-5")
print(response.content)

# With options
response: ChatResponse = await openai.chat(
    "Explain quantum computing",
    model="gpt-4o-mini",
    system="Be concise",
    temperature=0.5
)

Builder Pattern (Fluent Configuration)

from stratifyai.chat import anthropic
from stratifyai.chat.builder import ChatBuilder
from stratifyai.models import ChatResponse

# Configure once, use multiple times
client: ChatBuilder = (
    anthropic
    .with_model("claude-sonnet-4-5")
    .with_system("You are a helpful assistant")
    .with_temperature(0.7)
)

# All subsequent calls use the configured settings
response: ChatResponse = await client.chat("Hello!")
response: ChatResponse = await client.chat("Tell me more")

# Stream with builder
async for chunk in client.chat_stream("Write a story"):
    print(chunk.content, end="", flush=True)

Routing

  • Cost: choose cheapest model
  • Quality: choose highest‑quality model
  • Latency: choose fastest model
  • Hybrid (default): dynamic weighting based on complexity
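A hybrid strategy can be sketched as a weighted score over normalized cost, quality, and latency. The `ModelInfo` fields, weights, and pricing figures below are illustrative assumptions, not StratifyAI's internals or real price data:

```python
from dataclasses import dataclass

@dataclass
class ModelInfo:
    name: str
    cost_per_1k: float   # USD per 1K tokens (illustrative figures)
    quality: float       # 0..1, higher is better
    latency_ms: float

def route(models: list[ModelInfo],
          w_cost: float = 0.4, w_quality: float = 0.4, w_latency: float = 0.2) -> ModelInfo:
    """Pick the model with the best weighted score, normalizing each dimension."""
    max_cost = max(m.cost_per_1k for m in models)
    max_lat = max(m.latency_ms for m in models)

    def score(m: ModelInfo) -> float:
        # Cheaper and faster score higher; quality is used directly
        return (w_cost * (1 - m.cost_per_1k / max_cost)
                + w_quality * m.quality
                + w_latency * (1 - m.latency_ms / max_lat))

    return max(models, key=score)

models = [
    ModelInfo("gpt-4o-mini", 0.00015, 0.75, 400),
    ModelInfo("gpt-4o", 0.0025, 0.92, 900),
]
route(models, w_cost=1.0, w_quality=0.0, w_latency=0.0)  # cost strategy: cheapest wins
route(models, w_cost=0.0, w_quality=1.0, w_latency=0.0)  # quality strategy: best model wins
```

A "dynamic" hybrid router would additionally adjust the weights per request, e.g. shifting weight toward quality when the prompt looks complex.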

RAG

  • Embeddings (OpenAI)
  • ChromaDB vector storage
  • Semantic search
  • Document indexing
  • Retrieval‑augmented generation
  • Citation tracking
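The semantic-search and citation-tracking steps can be illustrated with a dependency-free toy. A real pipeline uses OpenAI embeddings and ChromaDB as listed above; this stand-in swaps in bag-of-words cosine similarity so the ranking logic is visible and runnable on its own.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words term counts.
    A real pipeline would call an embedding model instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, docs: dict[str, str], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank documents by similarity; returning doc IDs enables citation tracking."""
    q = embed(query)
    scored = [(doc_id, cosine(q, embed(text))) for doc_id, text in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

docs = {
    "doc1": "quantum computing uses qubits",
    "doc2": "classical computing uses bits",
    "doc3": "bread recipes need flour",
}
results = search("what is quantum computing", docs)
# The top doc IDs would be passed to the LLM as retrieved context,
# and echoed back in the answer as citations.
```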

Project Structure

stratifyai/
├── llm_abstraction/      # Core package
│   ├── providers/        # Provider implementations (9 providers)
│   ├── router.py         # Intelligent routing
│   ├── models.py         # Data models
│   └── utils/            # Utilities (token counting, extraction)
├── chat/                 # Simplified chat modules with builder pattern
│   ├── builder.py        # ChatBuilder class
│   └── stratifyai_*.py   # Provider-specific modules
├── cli/                  # Typer CLI
├── api/                  # Optional FastAPI server
├── examples/             # Usage examples
└── docs/                 # Technical documentation

Testing

pytest           # Run all tests
pytest -v        # Verbose output

Test Coverage: 300+ tests across all modules


License

Internal project — All rights reserved.
