Deterministic Speckit pipeline abstraction for AI agent integration

Coagula

Python 3.10+ Tests License: MIT

Deterministic pipeline abstraction for AI agents. Turn SOPs into strictly typed, validated pipelines. CLI + Python API.

Coagula encapsulates standard operating procedures (SOPs) — "Speckits" — into deterministic micro-workers. The orchestrator (Hermes, etc.) decides when to run a Speckit; Coagula executes it step-by-step and returns a validated Pydantic model.

Features

  • Strict data contracts: Inputs and outputs validated with Pydantic.
  • Multi-provider: OpenAI, Anthropic, Gemini, or any OpenAI-compatible provider (DeepSeek, OpenRouter, Groq).
  • Output modes: verbose, concise, or technical (programmatic).
  • Custom response models: Any BaseModel subclass per pipeline.
  • Automatic retries: Configurable retry on validation failure.
  • Async: await engine.arun(...) for async orchestrators.
  • Typed BridgeResult: handle_tool_call() returns a typed result with both attribute and dict-style access.
  • Auto-patch for OpenAI-compatible providers: No manual monkey-patching needed.
  • CLI + Python API: Run pipelines from the terminal or integrate them into your own code.
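
To make "retry on validation failure" concrete, here is a minimal, stdlib-only sketch of the idea. It is not Coagula's implementation: the `Decision` dataclass stands in for a Pydantic model, and `run_with_retries` and the fake producer are hypothetical names used only for illustration.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    """Stand-in for a validated result model (not Coagula's SpeckitResult)."""

    final_decision: str
    confidence_score: float

    def __post_init__(self) -> None:
        # Reject payloads that break the contract, as a schema validator would.
        if not isinstance(self.final_decision, str) or not self.final_decision:
            raise ValueError("final_decision must be a non-empty string")
        if not 0.0 <= float(self.confidence_score) <= 1.0:
            raise ValueError("confidence_score must be between 0.0 and 1.0")


def run_with_retries(produce: Callable[[], str], max_retries: int = 3) -> Decision:
    """Call produce() until its JSON output validates (1 + max_retries attempts)."""
    last_error: Exception | None = None
    for _ in range(1 + max_retries):
        try:
            return Decision(**json.loads(produce()))
        except (ValueError, TypeError) as exc:  # bad JSON, wrong fields, bad values
            last_error = exc
    raise RuntimeError(f"all attempts failed: {last_error}")


# Fake "LLM" that fails twice before returning a valid payload.
outputs = iter([
    "not json",
    '{"final_decision": "ok", "confidence_score": 1.7}',
    '{"final_decision": "Profitable", "confidence_score": 0.9}',
])
result = run_with_retries(lambda: next(outputs))
print(result.final_decision, result.confidence_score)
```

The caller only ever sees a value that passed validation or a single final error, which is what lets an orchestrator treat the pipeline as a deterministic step.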

Quick Start

1. Install

pip install coagula

# For development:
pip install -e ".[dev]"

2. Set your API key

export OPENAI_API_KEY="sk-..."

# For OpenAI-compatible providers (DeepSeek, OpenRouter, Groq):
export OPENAI_BASE_URL="https://api.deepseek.com/v1"
export OPENAI_API_KEY="sk-..."

3. Install phidata extras

pip install 'phidata[openai]'   # for OpenAI
pip install 'phidata[anthropic]' # for Anthropic
pip install 'phidata[gemini]'    # for Gemini

4. CLI usage

# Verbose mode (default) — full analysis + steps + decision
# (single quotes keep the shell from expanding the $ signs)
coagula --data-source 'Q3 revenue: $12.4M, COGS: $7.1M' \
        --objective "Determine profitability"

# Technical mode — structured output, minimal prose
coagula -d "Define a CLI tool..." -o "Architecture plan" --mode technical --json

# Concise mode — shorter output
coagula -d "data..." -o "analyze" --mode concise

# Custom provider
coagula -p openai -m deepseek-v4-flash -d "..." -o "..." --json

5. Python API

from coagula import OrchestratorBridge, ToolCall

bridge = OrchestratorBridge()
bridge.register_pipeline("data_analysis")

tool_call = ToolCall(
    name="data_analysis",
    arguments={
        "data_source": "Q3 revenue: $12.4M, COGS: $7.1M.",
        "business_objective": "Determine profitability.",
    },
    tool_call_id="call_abc123",
)

result = bridge.handle_tool_call(tool_call)

# Typed access (recommended)
if result.success:
    print(f"Decision: {result.data.final_decision}")
    print(f"Confidence: {result.data.confidence_score}")
else:
    print(f"Pipeline failed: {result.error}")

# Dict-style access (backward compatible)
sr = result["result"]  # -> model_dump()
print(sr["final_decision"])

Output Modes

Mode               Flag              Use Case          Behavior
verbose (default)  (none)            Human reading     Full analysis, detailed steps, long decision
concise            --mode concise    Quick summaries   Short analysis, 3 steps max, direct decision
technical          --mode technical  Programmatic use  Minimal prose, structured data in details field

Custom Response Models

Each pipeline can use a different output schema:

from pydantic import BaseModel
from coagula import SpeckitEngine, SpeckitConfig

class MySchema(BaseModel):
    command: str
    args: list[str]

engine = SpeckitEngine(config=SpeckitConfig(
    response_model=MySchema,
    output_mode="technical",
))
result = engine.run(data_source="...", business_objective="...")
# result is MySchema, not SpeckitResult
print(result.command, result.args)

Rich Output with details

In technical mode, the details field holds arbitrary structured data:

from coagula import SpeckitEngine, SpeckitConfig

engine = SpeckitEngine(config=SpeckitConfig(
    output_mode="technical",
    instructions=[
        "Put the JSON schema in details['schema']",
        "Put the task list in details['tasks']",
    ],
))
result = engine.run(data_source="...", business_objective="...")
if result.details:
    print(result.details.get("schema"))
    print(result.details.get("tasks"))

Async Execution

result = await engine.arun(data_source="...", business_objective="...")

Returns the same model type (SpeckitResult or custom).
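
As a rough sketch of how an async orchestrator might fan out several runs at once, the example below uses a stand-in `FakeEngine` whose `arun` takes the same keyword arguments as `engine.arun(...)` above. `FakeEngine` and `run_batch` are hypothetical; no LLM call is made, and the real engine returns a Pydantic model instead of a dict.

```python
import asyncio


class FakeEngine:
    """Stand-in with the same call shape as engine.arun(); makes no LLM call."""

    async def arun(self, data_source: str, business_objective: str) -> dict:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return {"final_decision": f"{business_objective}: {len(data_source)} chars"}


async def run_batch(engine, jobs):
    """Run several pipelines concurrently; results come back in job order."""
    tasks = [engine.arun(data_source=d, business_objective=o) for d, o in jobs]
    return await asyncio.gather(*tasks)


jobs = [
    ("Q3 revenue ...", "Determine profitability"),
    ("Q4 forecast ...", "Assess risk"),
]
results = asyncio.run(run_batch(FakeEngine(), jobs))
for item in results:
    print(item["final_decision"])
```

`asyncio.gather` preserves input order, so each result can be matched back to its job by position.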

Hermes / Agent Integration

import json

from coagula import OrchestratorBridge, ToolCall, get_speckit_tool_schema

# Expose the tool schema to your orchestrator
schema = get_speckit_tool_schema()

bridge = OrchestratorBridge()
bridge.register_pipeline("execute_speckit_data_pipeline")

def on_tool_call(name, arguments, tool_call_id):
    tc = ToolCall(name=name, arguments=arguments, tool_call_id=tool_call_id)
    result = bridge.handle_tool_call(tc)
    if result.success:
        return OrchestratorBridge.format_as_tool_response(
            tool_call_id=tool_call_id,
            content=result.data.model_dump(),
        )
    # json.dumps escapes quotes and newlines in the error message,
    # so the tool response is always valid JSON.
    return {"role": "tool", "tool_call_id": tool_call_id,
            "content": json.dumps({"error": result.error})}

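Some orchestrator schema sanitizers reject the allOf keyword (see Pitfalls), so it can help to scan a tool schema before exposing it. The sketch below is a generic recursive check. The `tool_schema` dict is a hypothetical example in the common function-tool shape, and the real output of get_speckit_tool_schema() may look different.

```python
from __future__ import annotations

from typing import Any, Iterator


def find_keyword(schema: Any, keyword: str, path: str = "$") -> Iterator[str]:
    """Yield the path of every dict key equal to `keyword` in a nested schema."""
    if isinstance(schema, dict):
        for key, value in schema.items():
            child = f"{path}.{key}"
            if key == keyword:
                yield child
            yield from find_keyword(value, keyword, child)
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            yield from find_keyword(item, keyword, f"{path}[{index}]")


# Hypothetical schema, for illustration only.
tool_schema = {
    "type": "function",
    "function": {
        "name": "execute_speckit_data_pipeline",
        "parameters": {
            "type": "object",
            "properties": {
                "data_source": {"type": "string", "description": "Raw data to analyze"},
                "business_objective": {"type": "string", "description": "Business objective / goal"},
            },
            "required": ["data_source", "business_objective"],
        },
    },
}

offending = list(find_keyword(tool_schema, "allOf"))
print(offending or "no allOf found")
```
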
Multi-Provider Setup

OpenAI (default)

export OPENAI_API_KEY="sk-..."

Anthropic

export ANTHROPIC_API_KEY="sk-ant-..."
pip install 'phidata[anthropic]'
coagula -p anthropic -m claude-opus-4 -d "..." -o "..."

OpenAI-compatible (DeepSeek, OpenRouter, Groq)

Coagula auto-detects OPENAI_BASE_URL and patches phidata to avoid the unsupported developer role. No manual monkey-patching needed.

export OPENAI_API_KEY="sk-..."
export OPENAI_BASE_URL="https://api.deepseek.com/v1"
pip install 'phidata[openai]'
coagula -p openai -m deepseek-v4-flash -d "..." -o "..."

Note: Default model_id is gpt-4o. Always set --model or config.model_id for non-OpenAI providers.
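
One way to avoid silently sending gpt-4o to a non-OpenAI endpoint is to fail fast when a base URL is set but no model was chosen. This is a hypothetical guard you could run in your own code before building a config; `resolve_model_id` is not part of Coagula.

```python
from __future__ import annotations

import os
from typing import Mapping

DEFAULT_MODEL_ID = "gpt-4o"  # default stated in this README


def resolve_model_id(model_id: str | None, env: Mapping[str, str] | None = None) -> str:
    """Return the model ID to use, refusing the default on a custom endpoint."""
    env = os.environ if env is None else env
    if env.get("OPENAI_BASE_URL") and not model_id:
        raise ValueError(
            "OPENAI_BASE_URL is set; pass an explicit model ID "
            f"instead of relying on the default ({DEFAULT_MODEL_ID})"
        )
    return model_id or DEFAULT_MODEL_ID


deepseek_env = {"OPENAI_BASE_URL": "https://api.deepseek.com/v1"}
print(resolve_model_id("deepseek-v4-flash", env=deepseek_env))
print(resolve_model_id(None, env={}))
```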

CLI Reference

coagula --data-source <text> --objective <text> [options]

Options:
  -d, --data-source TEXT    Raw data to analyze (required)
  -o, --objective TEXT      Business objective / goal (required)
  -p, --provider TEXT       LLM provider (default: openai)
  -m, --model TEXT          Model ID (default: gpt-4o)
  -r, --max-retries INT     Max retries on failure (default: 3)
  --mode TEXT               Output mode: verbose, concise, technical
  --details                 Show the details field in output
  --register NAME           Register pipeline under a custom name
  -l, --list-pipelines      List registered pipelines
  --json                    Output as JSON
  -h, --help                Show this help

Environment variables:
  OPENAI_API_KEY            Required for openai provider
  ANTHROPIC_API_KEY         Required for anthropic provider
  GEMINI_API_KEY            Required for gemini provider
  OPENAI_BASE_URL           Set for OpenAI-compatible providers

Configuration

from coagula import SpeckitConfig, SpeckitEngine
from coagula.models import SpeckitResult  # contracts live in the models module

config = SpeckitConfig(
    provider="openai",              # any provider string
    model_id="gpt-4o",              # model ID for the provider
    max_retries=3,                  # 0-10 retries on failure
    output_mode="verbose",          # verbose | concise | technical
    response_model=SpeckitResult,   # default; or any custom BaseModel subclass
    instructions=[                  # custom SOP instructions
        "1. Analyze data_source based on business_objective.",
        "2. Do not ask questions. Assume conservative defaults.",
    ],
)

engine = SpeckitEngine(config=config)
result = engine.run(data_source="...", business_objective="...")

Models

from typing import Any, Literal

from pydantic import BaseModel

class SpeckitResult(BaseModel):
    context_analysis: str
    executed_steps: list[str]
    final_decision: str
    confidence_score: float       # 0.0 to 1.0
    details: dict[str, Any] | None

class SpeckitConfig(BaseModel):
    provider: str                # any string (was Literal)
    model_id: str                # default: gpt-4o
    max_retries: int             # 0-10, default 3
    instructions: list[str] | None
    output_mode: Literal["verbose", "concise", "technical"]
    response_model: type[BaseModel] | None  # default: SpeckitResult

class BridgeResult(BaseModel):
    success: bool
    tool_call_id: str
    data: BaseModel | None       # SpeckitResult or custom model
    error: str | None

Error Handling

from coagula.exceptions import (
    CoagulaError,          # Base — catch-all
    ValidationError,       # Bad input data
    ExecutionError,        # LLM failure
    ConfigurationError,    # Missing provider/module
    RetryExhaustedError,   # All retries exhausted
)

When you go through the bridge, these errors are not raised to the caller. handle_tool_call() captures them in the BridgeResult, so check result.success and result.error instead.
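
The "capture instead of raise" pattern can be sketched in a few lines. The names below (`PipelineError`, `Outcome`, `capture`) are hypothetical stand-ins for CoagulaError, BridgeResult, and the bridge's internal handling; this shows the general shape, not Coagula's actual code.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


class PipelineError(Exception):
    """Hypothetical base error, playing the role of CoagulaError."""


@dataclass
class Outcome:
    """Stand-in for BridgeResult: carries either data or an error message."""

    success: bool
    data: Any = None
    error: str | None = None


def capture(fn: Callable[[], Any]) -> Outcome:
    """Run fn and convert pipeline errors into a failed Outcome."""
    try:
        return Outcome(success=True, data=fn())
    except PipelineError as exc:
        return Outcome(success=False, error=f"{type(exc).__name__}: {exc}")


def failing() -> dict:
    raise PipelineError("LLM call failed")


ok = capture(lambda: {"final_decision": "Profitable"})
bad = capture(failing)
print(ok.success, bad.success, bad.error)
```

Catching only the library's base error keeps unrelated bugs (a TypeError in your own code, for example) visible instead of hiding them in an error string.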

Development

make dev      # pip install -e ".[dev]"
make test     # pytest (65 tests)
make mypy     # strict type check
make ci       # all of the above
make clean    # remove caches and build artifacts

Architecture

┌─────────────────┐     Tool Call      ┌─────────────────────┐
│   Orchestrator  │ ──────────────────> │   OrchestratorBridge │
│  (Hermes, etc.) │                     │                      │
│                 │ <────────────────── │  ┌─────────────────┐ │
└─────────────────┘   JSON result       │  │ SpeckitEngine   │ │
                                         │  │ (Phidata Agent) │ │
                                         │  └─────────────────┘ │
                                         └─────────────────────┘

Module      Responsibility
models      Pydantic contracts (SpeckitResult, BridgeResult, SpeckitConfig)
engine      Phidata execution engine with retry, async, auto-patch
tools       JSON schema generation + pipeline registry
bridge      Orchestrator integration adapter
exceptions  Type-safe error hierarchy
__main__    CLI entrypoint

Pitfalls

  1. phidata extras: pip install 'phidata[openai]' (or anthropic/gemini). Coagula lazy-imports and raises ConfigurationError if missing.
  2. OpenAI-compatible model IDs: Default is gpt-4o. Set --model for non-OpenAI providers (e.g. deepseek-v4-flash for DeepSeek).
  3. Hermes schema sanitizer: Avoid allOf in tool schemas. Use description fields instead.
  4. Engine caching: Bridge caches engines by pipeline name. Call unregister_pipeline() before re-registering with a different config.
  5. Mypy: Run as python -m mypy -p coagula (not mypy src/coagula).
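
Pitfall 4 is easier to see with a toy model. The sketch below assumes a registry that builds an engine lazily and caches it by pipeline name. `CachingRegistry` is a hypothetical stand-in, and Coagula's bridge internals may differ, but the failure mode is the same: re-registering without unregistering leaves the old engine in the cache.

```python
class CachingRegistry:
    """Toy registry: engines are built once per name and then reused."""

    def __init__(self):
        self._configs = {}
        self._engines = {}

    def register_pipeline(self, name, config):
        self._configs[name] = config

    def unregister_pipeline(self, name):
        # Drop both the config and any engine already built from it.
        self._configs.pop(name, None)
        self._engines.pop(name, None)

    def get_engine(self, name):
        if name not in self._engines:
            self._engines[name] = {"config": dict(self._configs[name])}
        return self._engines[name]


registry = CachingRegistry()
registry.register_pipeline("data_analysis", {"output_mode": "verbose"})
first = registry.get_engine("data_analysis")

# Re-registering alone does not rebuild the cached engine.
registry.register_pipeline("data_analysis", {"output_mode": "technical"})
stale = registry.get_engine("data_analysis")

# Unregistering first clears the cache, so the new config takes effect.
registry.unregister_pipeline("data_analysis")
registry.register_pipeline("data_analysis", {"output_mode": "technical"})
fresh = registry.get_engine("data_analysis")

print(stale["config"]["output_mode"], fresh["config"]["output_mode"])
```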

License

MIT
