Skip to main content

The Python substrate for observable agent engineering — LLM routing, policy governance, MCP integration, evaluation, skills packaging, realtime orchestration, and structured tracing.

Project description

ElectriPy Studio — The Python substrate for observable agent engineering

ElectriPy Studio

The Python substrate for observable agent engineering.

CI PyPI Release Python 3.11+ License: MIT

Overview

ElectriPy Studio is a curated collection of production-grade Python components for building observable, testable, and governable agent systems. It provides composable infrastructure for LLM routing, evaluation, policy enforcement, MCP integration, reusable skills packaging, realtime session orchestration, and telemetry-aware runtime execution — all without adopting a framework.

Use ElectriPy when you want typed, production-grade building blocks that compose into your architecture rather than a monolithic framework that owns it.

Why ElectriPy Studio

Problem What ElectriPy provides
Agent systems are hard to observe Observe — OpenTelemetry-aligned tracing with span kinds for LLM, agent, tool, retrieval, and policy operations
LLM calls need governance Policy Engine + Policy Gateway — rule-based access control, PII scanning, approval workflows, and request/response guardrails
Evaluation is an afterthought Evals + Eval Assertions — dataset-driven scoring, baseline drift detection, and pytest-native CI gating
Provider switching is costly LLM Gateway + Provider Adapters + Workload Router — swap providers without rewriting business logic; route by cost, latency, or capability
Tool integrations are fragile MCP Toolkit — strongly typed Model Context Protocol clients and server adapters
Agent knowledge is scattered Skills — versioned, validated, template-aware skill packages with manifest-driven composition
Streaming sessions are glue code Realtime — session lifecycle, event sequencing, tool-call orchestration, interruption, and backpressure in a provider-neutral runtime
No time to build infrastructure 30+ composable components — caching, retries, circuit breakers, JSON repair, cost tracking, batch fan-out, replay tapes, and more

Design principles

  • Ports & Adapters everywhere. Swap providers, stores, transports, and tools without rewriting business logic.
  • Deterministic by default. Stable IDs, reproducible evaluation runs, and guarded state machines.
  • Observable from day one. Structured tracing, telemetry hooks, and observer ports are built in — not bolted on.
  • Safe logging posture. Hashes and redaction seams instead of raw prompts in logs.
  • Typed, production APIs. Small public surfaces, strict typing, frozen dataclasses, and Protocol-based interfaces.
  • Testable without the network. 1,000+ tests run offline, deterministically, with no API keys required.

Architecture

graph TD
    subgraph Foundation
        CORE[Core — config, logging, errors]
        CONC[Concurrency — retry, rate limit, circuit breaker]
        IO[IO — JSONL read/write]
        CLI[CLI — commands & demos]
    end

    subgraph "Agent Infrastructure"
        GW[LLM Gateway]
        PA[Provider Adapters]
        WR[Workload Router]
        FC[Fallback Chain]
        BC[Batch Complete]
        SO[Structured Output]
    end

    subgraph "Observability & Governance"
        OBS[Observe — tracing & spans]
        TEL[Telemetry — adapters]
        POL[Policy Engine]
        PGW[Policy Gateway]
        SDS[Sensitive Data Scanner]
    end

    subgraph "Evaluation & Quality"
        EV[Evals — dataset scoring]
        EA[Eval Assertions — CI gating]
        RAG[RAG Eval Runner]
    end

    subgraph "Composition & Packaging"
        SK[Skills — versioned packages]
        MCP[MCP Toolkit]
        PE[Prompt Engine]
        TR[Tool Registry]
    end

    subgraph "Orchestration & Runtime"
        RT[Realtime — session orchestration]
        AC[Agent Collaboration]
        SC[Streaming Chat]
    end

    GW --> PA
    GW --> FC
    GW --> BC
    GW --> SO
    WR --> GW
    PGW --> GW
    POL --> PGW
    OBS --> TEL
    SK --> PE
    RT --> TR
    AC --> POL
    EV --> EA

Package map

Agent infrastructure

Package Purpose
llm_gateway Provider-agnostic sync/async LLM clients with request/response hooks
provider_adapters OpenAI, Anthropic, Ollama, and generic HTTP-JSON adapters
workload_router Policy-driven, cost/latency/capability-aware model selection and routing
fallback_chain Ranked provider failover with metadata tracking
batch_complete Concurrent LLM fan-out with bounded concurrency and per-request error isolation
structured_output Pydantic model extraction from LLM text with auto-retry and temperature decay
llm_cache Pluggable response caching (in-memory LRU, SQLite WAL) with hit-rate tracking
replay_tape Record, replay, and diff LLM interactions for deterministic offline tests

Observability & governance

Package Purpose
observe OpenTelemetry-aligned structured tracing with AI-specific span kinds (LLM, agent, tool, retrieval, policy, MCP)
telemetry Provider-agnostic telemetry adapters (JSONL, OpenTelemetry) for HTTP, LLM, policy, and RAG events
policy Enterprise policy engine — subject/resource/action rules, approval workflows, evidence requirements, escalation chains
policy_gateway Deterministic request/response guardrails with regex-based detection, sanitization, and multi-stage enforcement
sensitive_data_scanner PII and secret detection with 9+ built-in patterns and extensible custom rules

Evaluation & quality

Package Purpose
evals Dataset-driven evaluation framework with scoring, baseline comparison, and CI-friendly reporting
eval_assertions Pytest-native assertion helpers (keyword, regex, JSON schema, predicate, length) for LLM output validation
rag_eval_runner Retrieval benchmarking with precision/recall/MRR metrics and drift detection

Composition & packaging

Package Purpose
skills Versioned, validated skill packages with manifest-driven composition and {{variable}} template rendering
mcp Strongly typed Model Context Protocol toolkit for building MCP clients, servers, and tool adapters
prompt_engine Template composition, variable substitution, and few-shot example management
tool_registry Declarative tool definitions with JSON schema generation and OpenAI function-calling format

Orchestration & runtime

Package Purpose
realtime Session lifecycle orchestration — event sequencing, tool calls, interruption, backpressure, transport abstraction
agent_collaboration Bounded multi-agent handoff orchestration with hop limits and policy integration
streaming_chat Sync/async stream chunk primitives and text collection helpers
agent_runtime Deterministic tool-plan execution with step-by-step control

Core infrastructure

Package Purpose
core Configuration, structured logging, error hierarchy, type utilities
concurrency Retry (sync/async), rate limiting, circuit breaker for cascading failure protection
io JSONL read/write, data processing utilities
cli Typer-based CLI with health checks, RAG eval, and offline demo commands

Supporting components

Component Purpose
cost_ledger Thread-safe token cost accumulation with multi-label slicing
prompt_fingerprint Deterministic SHA-256 request hashing for caching, dedup, and drift detection
json_repair Fix 7 common LLM JSON breakage patterns in one call
conversation_memory Sliding-window and token-aware chat history management
context_assembly Priority-based context window packing and truncation
model_router Rule-based model selection (see also workload_router for the full routing engine)
token_budget Pluggable token counting and budget-aware truncation
hallucination_guard Grounding and citation verification checks
response_robustness JSON extraction, output guards, and structured response validation
rag_quality Retrieval quality metrics and drift comparison helpers

How ElectriPy compares

ElectriPy is not a framework — it is composable infrastructure. Import the pieces you need; leave the rest.

Library Overlap ElectriPy's edge
LiteLLM Provider-agnostic LLM gateway Bundles policy hooks, observability, structured output, and workload routing inline — no proxy server
Guardrails AI Input/output validation Lighter-weight, composable policy engine + gateway — no XML DSL or hosted dependency
CrewAI / AutoGen Multi-agent orchestration Bounded, deterministic collaboration with hop limits; building blocks, not a framework
RAGAS RAG evaluation Integrates eval directly into CI gating with drift comparison; ships scoring, assertions, and dataset harness
Instructor Structured LLM output Dedicated structured output engine with retry + temperature decay, plus caching, replay tape, and cost tracking
Haystack / LangChain Full RAG/agent framework Composable building blocks you import — not a framework you adopt wholesale

Status

  • Maturity: Early alpha — APIs may still evolve. Core components, agent infrastructure, and the full observability/governance/evaluation stack are implemented and tested.
  • Test suite: 1,000+ tests, all offline and deterministic.
  • Versioning: SemVer at v0.x — expect breaking changes until v1.0.

Quick start

Install

pip install electripy-studio

Verify

electripy doctor

Core usage

from electripy import Config, get_logger
from electripy.concurrency import retry

config = Config.from_env()
logger = get_logger(__name__)

@retry(max_attempts=3, delay=1.0, backoff=2.0)
def fetch_data():
    return api_call()

LLM Gateway with policy hooks

from electripy.ai.llm_gateway import LlmGatewaySyncClient
from electripy.ai.policy_gateway import PolicyGateway, PolicyRule, PolicyStage, PolicyAction

gateway = PolicyGateway(rules=[
    PolicyRule(
        rule_id="pii-email", code="PII_EMAIL",
        description="Mask emails",
        stage=PolicyStage.PREFLIGHT,
        pattern=r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+",
        action=PolicyAction.SANITIZE,
    ),
])

Evaluation in CI

from electripy.ai.eval_assertions import assert_llm_output

assert_llm_output("The capital of France is Paris.", contains=["Paris"], min_length=10)

Realtime session

from electripy.ai.realtime import RealtimeSessionService, RealtimeConfig, OutputStreamChunk

svc = RealtimeSessionService()
session = svc.create_session(config=RealtimeConfig(model="gpt-4o"))
svc.start_session(session.session_id)
svc.emit_output(session.session_id, OutputStreamChunk(index=0, text="Hello"))
svc.complete_session(session.session_id)

Demo: Policy + Agent Collaboration

electripy demo policy-collab

See recipes/03_policy_collaboration/ for the standalone script.

Documentation

Full documentation is served via MkDocs. Build and serve locally:

pip install -e ".[docs]"
mkdocs serve

Getting started

Agent infrastructure

Observability & governance

Evaluation & quality

Composition & packaging

Orchestration & runtime

Foundation

Reference

Project structure

electripy-studio/
├── src/electripy/
│   ├── core/                   # Config, logging, errors, typing
│   ├── concurrency/            # Retry, rate limiting, circuit breaker
│   ├── io/                     # JSONL utilities
│   ├── cli/                    # CLI commands & demos
│   └── ai/                     # Agent engineering components
│       ├── llm_gateway/        # Provider-agnostic LLM clients
│       ├── workload_router/    # Cost/latency/capability-aware model routing
│       ├── observe/            # Structured tracing & span lifecycle
│       ├── mcp/                # Model Context Protocol toolkit
│       ├── evals/              # Dataset-driven evaluation framework
│       ├── policy/             # Enterprise policy engine
│       ├── policy_gateway/     # Request/response guardrails
│       ├── skills/             # Versioned skill packaging
│       ├── realtime/           # Session orchestration & event pipeline
│       ├── agent_collaboration/# Multi-agent handoff orchestration
│       ├── structured_output/  # Pydantic extraction with retry
│       ├── eval_assertions/    # Pytest-native LLM output validation
│       ├── streaming_chat/     # Stream chunk primitives
│       ├── llm_cache/          # Response caching (LRU, SQLite)
│       ├── replay_tape/        # Record/replay/diff LLM interactions
│       ├── tool_registry/      # Declarative tool definitions
│       ├── prompt_engine/      # Template composition
│       ├── token_budget/       # Token counting & truncation
│       ├── context_assembly/   # Priority-based context packing
│       ├── agent_runtime/      # Deterministic tool-plan execution
│       ├── rag_eval_runner/    # Retrieval benchmarking
│       ├── rag_quality/        # Retrieval quality metrics
│       ├── hallucination_guard/# Grounding & citation checks
│       ├── response_robustness/# Output guards & JSON extraction
│       ├── model_router/       # Rule-based model selection
│       ├── conversation_memory/# Sliding-window chat history
│       ├── fallback_chain.py   # Provider failover
│       ├── batch_complete.py   # Concurrent LLM fan-out
│       ├── cost_ledger.py      # Token cost accumulation
│       ├── prompt_fingerprint.py # Request hashing
│       ├── json_repair.py      # LLM JSON breakage repair
│       └── sensitive_data_scanner.py # PII & secret detection
├── tests/                      # 1,000+ offline, deterministic tests
├── docs/                       # MkDocs documentation
├── recipes/                    # Runnable examples
│   ├── 01_cli_tool/
│   ├── 02_llm_gateway/
│   └── 03_policy_collaboration/
└── pyproject.toml

Recipes

Additional recipe guides in the docs:

Development

Running tests

pytest tests/ -v

With coverage:

pytest tests/ -v --cov=src --cov-report=term-missing

Code quality

ruff check .                  # Linting
black .                       # Formatting
mypy src/                     # Type checking

Python tooling (recommended)

These tools are optional but recommended for contributors:

pipx install uv               # Fast package manager
pipx install ruff              # Fast linter
pipx install pre-commit        # Git pre-commit hooks

uv venv .venv && source .venv/bin/activate
uv pip install -e ".[dev]"
pre-commit install

CI/CD

GitHub Actions automatically runs tests, linting, and type checking on all pull requests.

Requirements

  • Python 3.11 or higher
  • Dependencies managed via pyproject.toml

License

MIT License — see LICENSE for details.

Contributing

Contributions are welcome! Please read our Contributing Guide and Code of Conduct before submitting PRs. For security issues, see SECURITY.md.

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

electripy_studio-0.4.0.tar.gz (261.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

electripy_studio-0.4.0-py3-none-any.whl (276.6 kB view details)

Uploaded Python 3

File details

Details for the file electripy_studio-0.4.0.tar.gz.

File metadata

  • Download URL: electripy_studio-0.4.0.tar.gz
  • Upload date:
  • Size: 261.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for electripy_studio-0.4.0.tar.gz
Algorithm Hash digest
SHA256 1b79215a003c33a98fc3a163ce573c4a6a1c01471a1e7f9b891c94c9a81752af
MD5 b899f8e17305872b98099cdb10306400
BLAKE2b-256 f79660f57dcd043f9f743466ed8ec57132fdd6b710213663ff56d2265b900132

See more details on using hashes here.

File details

Details for the file electripy_studio-0.4.0-py3-none-any.whl.

File metadata

File hashes

Hashes for electripy_studio-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a15037d8114d0c0100cf57f089ed5379ab216f3700acd9b1e596d621a7ea8894
MD5 84413deb4e955f1fa268a155b7f688c3
BLAKE2b-256 8b115ee69c3ad61bb18e92b5771dca8baec9aca20ebe655dd2a206c592531d5b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page