Skip to main content

Unified multi-provider LLM abstraction module with intelligent routing, cost tracking, and caching

Project description

StratifyAI

StratifyAI — Unified Multi‑Provider LLM Interface

Python License Tests Providers

Status: Phase 7.8 Complete
Providers: 9 Operational
Features: Routing • RAG • Caching • Streaming • CLI • Web UI • Builder Pattern

StratifyAI is a production‑ready Python framework that provides a unified interface for 9+ LLM providers, including OpenAI, Anthropic, Google, DeepSeek, Groq, Grok, OpenRouter, Ollama, and AWS Bedrock. It eliminates vendor lock‑in, simplifies multi‑model development, and enables intelligent routing, cost tracking, caching, streaming, and RAG workflows.


Features

Core

  • Unified API for 9+ LLM providers
  • Async-first architecture with sync wrappers
  • Automatic provider detection
  • Cost tracking and budget enforcement
  • Latency tracking on all responses
  • Retry logic with fallback models
  • Streaming support for all providers
  • Response caching + provider prompt caching
  • Intelligent routing (cost, quality, latency, hybrid)
  • Capability filtering (vision, tools, reasoning)
  • Model metadata and context window awareness
  • Builder pattern for fluent configuration

Advanced

  • Large‑file handling with chunking and progressive summarization
  • File extraction (CSV schema, JSON schema, logs, code structure)
  • Auto model selection for extraction tasks
  • RAG pipeline with embeddings + vector DB (ChromaDB)
  • Semantic search and citation tracking
  • Rich/Typer CLI with interactive mode
  • Optional FastAPI web interface

Installation

git clone https://github.com/Bytes0211/stratifyai.git
cd stratifyai
pip install -e .

Or using uv:

uv sync

Configuration

cp .env.example .env
# Add your API keys

Check configured providers:

stratifyai check-keys

Quick Start

CLI Usage

stratifyai chat -p openai -m gpt-4o-mini -t "Hello"
stratifyai route "Explain relativity" --strategy hybrid
stratifyai interactive
stratifyai cache-stats

Python Example (LLMClient)

from stratifyai import LLMClient
from stratifyai.models import Message, ChatRequest, ChatResponse

client: LLMClient = LLMClient()
request: ChatRequest = ChatRequest(
    model="gpt-4o-mini",
    messages=[Message(role="user", content="Explain quantum computing")]
)

# Async (recommended)
response: ChatResponse = await client.chat_completion(request)

# Sync wrapper for scripts/CLI
response: ChatResponse = client.chat_completion_sync(request)

print(response.content)
print(f"Cost: ${response.usage.cost_usd:.6f}")
print(f"Latency: {response.latency_ms:.0f}ms")

Python Example (Chat Package - Simplified)

from stratifyai.chat import anthropic, openai
from stratifyai.models import ChatResponse

# Quick usage - model is always required
response: ChatResponse = await anthropic.chat("Hello!", model="claude-sonnet-4-5")
print(response.content)

# With options
response: ChatResponse = await openai.chat(
    "Explain quantum computing",
    model="gpt-4o-mini",
    system="Be concise",
    temperature=0.5
)

Builder Pattern (Fluent Configuration)

from stratifyai.chat import anthropic
from stratifyai.chat.builder import ChatBuilder
from stratifyai.models import ChatResponse

# Configure once, use multiple times
client: ChatBuilder = (
    anthropic
    .with_model("claude-sonnet-4-5")
    .with_system("You are a helpful assistant")
    .with_temperature(0.7)
)

# All subsequent calls use the configured settings
response: ChatResponse = await client.chat("Hello!")
response: ChatResponse = await client.chat("Tell me more")

# Stream with builder
async for chunk in client.chat_stream("Write a story"):
    print(chunk.content, end="", flush=True)

Routing

  • Cost: choose cheapest model
  • Quality: choose highest‑quality model
  • Latency: choose fastest model
  • Hybrid (default): dynamic weighting based on complexity

RAG

  • Embeddings (OpenAI)
  • ChromaDB vector storage
  • Semantic search
  • Document indexing
  • Retrieval‑augmented generation
  • Citation tracking

Project Structure

stratifyai/
├── llm_abstraction/      # Core package
│   ├── providers/        # Provider implementations (9 providers)
│   ├── router.py         # Intelligent routing
│   ├── models.py         # Data models
│   └── utils/            # Utilities (token counting, extraction)
├── chat/                 # Simplified chat modules with builder pattern
│   ├── builder.py        # ChatBuilder class
│   └── stratifyai_*.py    # Provider-specific modules
├── cli/                  # Typer CLI
├── api/                  # Optional FastAPI server
├── examples/             # Usage examples
└── docs/                 # Technical documentation

Testing

pytest           # Run all tests
pytest -v        # Verbose output

Test Coverage: 300+ tests across all modules


License

Internal project — All rights reserved.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

stratifyai-0.1.1.tar.gz (213.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

stratifyai-0.1.1-py3-none-any.whl (120.4 kB view details)

Uploaded Python 3

File details

Details for the file stratifyai-0.1.1.tar.gz.

File metadata

  • Download URL: stratifyai-0.1.1.tar.gz
  • Upload date:
  • Size: 213.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for stratifyai-0.1.1.tar.gz
Algorithm Hash digest
SHA256 1f1ed1c08f7fbbed363ff77b761a3e79280f53d72ee923101c4ae520758c9213
MD5 b71f29cc3145c8d61cd5b2ddea129116
BLAKE2b-256 3742088daba6e0d7e72a974567a921495d2822be850c6ab117f5f9f313a83659

See more details on using hashes here.

File details

Details for the file stratifyai-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: stratifyai-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 120.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for stratifyai-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 fde769f8b962456ff8fb1ef975417669df36a9a2e4f31ffe8f9beef102a01fe6
MD5 c43eacc179d7e6697b515baa4064a14b
BLAKE2b-256 5f2b7b72d5a3654db68e6165bbe01c306561c41e466a41db9c4f54d37ca73301

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page