delegato

Intelligent delegation infrastructure for multi-agent AI systems. A protocol layer that sits between goals and agents — providing the organizational intelligence that governs how agents coordinate.

Python 3.11+ | License: MIT | Tests: 306 passed | Coverage: 94% | arXiv

Highlights

Core pipeline

  • Decomposition — LLM breaks complex goals into verifiable sub-tasks with dependency DAGs
  • Assignment — multi-objective scoring ranks agents by capability, trust, availability, and cost
  • Verification — contract-first output checking with five methods and multi-judge consensus
  • Consensus — N independent LLM judges reduce correlated verification failures
  • Parallel DAG — independent sub-tasks run concurrently with configurable parallelism

Safety & adaptability

  • Trust — per-agent, per-capability scores with time-based decay and asymmetric updates
  • Retry & reassign — failed tasks retry, then reassign to the next-best agent
  • Privilege attenuation — permissions narrow as delegation depth increases
  • Circuit breakers — sudden trust drops pause contracts and fire escalation events
  • Audit — every delegation decision is recorded with async event callbacks
  • LiteLLM — single LLM wrapper supports 100+ providers out of the box

Quick Start

pip install delegato

import asyncio
from delegato import Agent, Delegator, Task, TaskResult, VerificationMethod, VerificationSpec

# Self-contained — no API keys needed
async def mock_llm(messages):
    system = messages[0]["content"].lower()
    if "task decomposition" in system:
        return {"subtasks": [
            {"goal": "Do the work", "required_capabilities": ["general"],
             "verification_method": "none", "dependencies": []}
        ]}
    return {"score": 1.0, "reasoning": "ok"}

async def my_handler(task):
    return TaskResult(task_id=task.id, agent_id="worker", output="Hello from delegato!", success=True)

agent = Agent(id="worker", name="Worker", capabilities=["general"], handler=my_handler)
delegator = Delegator(agents=[agent], llm_call=mock_llm)

task = Task(
    goal="Complete a simple task",
    verification=VerificationSpec(method=VerificationMethod.NONE),
)

result = asyncio.run(delegator.run(task))
print(f"Success: {result.success}, Output: {result.output}")

Replace mock_llm with a real LLM call (any provider via LiteLLM) for production use.

Architecture

flowchart TB
    User["User / Client<br/><code>delegator.run(task)</code>"]

    subgraph DELEGATOR
        DE["Decomposition Engine<br/><i>LLM-powered</i>"]
        AS["Assignment Scorer<br/><i>multi-objective</i>"]
        TT["Trust Tracker<br/><i>time-based decay</i>"]
        CL["Coordination Loop<br/><i>parallel DAG execution</i>"]
        VE["Verification Engine<br/><i>multi-judge consensus</i>"]
        PM["Permission Manager<br/><i>privilege attenuation</i>"]
        EV["Event Bus + Audit Log"]
        LLM["LLM via LiteLLM<br/><i>100+ providers</i>"]
    end

    A["Agent A<br/><i>any framework</i>"]
    B["Agent B<br/><i>any framework</i>"]
    C["Agent C<br/><i>any framework</i>"]

    User --> DE
    DE --> AS
    TT --> AS
    AS --> CL
    CL --> A
    CL --> B
    CL --> C
    A --> VE
    B --> VE
    C --> VE
    VE --> CL
    CL --> PM
    CL --> EV
    DE --> LLM
    VE --> LLM

How It Works

Delegato receives a high-level goal, decomposes it into a dependency graph of sub-tasks, assigns each to the best-fit agent, executes them in parallel where possible, and verifies every output against its contract. Failed tasks are retried or reassigned automatically.
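As a rough illustration of "in parallel where possible", the sketch below groups a dependency graph into batches and runs each batch concurrently under a parallelism cap. It is not delegato's internal code: `parallel_batches`, `run_batches`, `run_task`, and `max_parallel` are hypothetical names, and the input is assumed to be a plain mapping from sub-task id to the ids it depends on.

```python
import asyncio


def parallel_batches(dependencies: dict[str, list[str]]) -> list[list[str]]:
    """Group task ids into batches; each batch depends only on earlier batches."""
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    batches: list[list[str]] = []
    while remaining:
        # A task is ready once all of its dependencies have been scheduled.
        ready = sorted(task for task, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        batches.append(ready)
        for task in ready:
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return batches


async def run_batches(batches, run_task, max_parallel=2):
    """Run each batch concurrently, never exceeding max_parallel tasks at once."""
    limit = asyncio.Semaphore(max_parallel)

    async def guarded(task_id):
        async with limit:
            return task_id, await run_task(task_id)

    results = {}
    for batch in batches:
        # gather() returns (task_id, result) pairs in batch order.
        results.update(await asyncio.gather(*(guarded(t) for t in batch)))
    return results
```

With the dependencies `{"search": [], "crawl": [], "analyze": ["search"], "synthesize": ["search", "analyze"]}`, `search` and `crawl` land in the first batch, `analyze` in the second, and `synthesize` in the third.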

flowchart LR
    CF["Complexity<br/>Floor Check"]
    T{"Trivial?"}
    FP["Fast Path<br/><i>direct execution</i>"]
    D["Decompose"]
    AE["Assign +<br/>Execute"]
    V["Verify"]
    P{"Passed?"}
    S["Success"]
    AD["Adapt<br/><i>retry / reassign</i>"]

    CF --> T
    T -- Yes --> FP
    T -- No --> D
    D --> AE
    AE --> V
    V --> P
    P -- Yes --> S
    P -- No --> AD
    AD --> AE

  1. Complexity floor — Tasks with complexity ≤ 2 and high reversibility skip decomposition when a trusted agent (trust ≥ 0.7) is available.
  2. Decompose — The LLM breaks the goal into sub-tasks, each with a verification method. Sub-tasks form a DAG with dependency edges.
  3. Assign & execute — Sub-tasks dispatch in parallel batches (topological order). Each is scored: 0.35 × capability + 0.30 × trust + 0.20 × availability + 0.15 × cost.
  4. Verify — Outputs are checked against contracts. LLM judge verification can use multiple independent judges with consensus voting.
  5. Adapt — Failures retry with the same agent, then reassign to the next-best. If all options are exhausted, the task escalates. Trust scores update after every outcome.
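The scoring formula in step 3 can be written out as a small weighted sum. This is a minimal sketch, not the library's `AssignmentScorer`: the `Candidate` dataclass and the function names are hypothetical, and it assumes every component is already normalized to the range 0 to 1, with a higher `cost` value meaning a cheaper agent.

```python
from dataclasses import dataclass

# Weights taken from step 3 above; they sum to 1.0.
WEIGHTS = {"capability": 0.35, "trust": 0.30, "availability": 0.20, "cost": 0.15}


@dataclass
class Candidate:
    """Hypothetical per-agent inputs, each assumed to lie in [0, 1]."""
    agent_id: str
    capability: float
    trust: float
    availability: float
    cost: float  # assumed to be a cost *score*: 1.0 = cheapest


def assignment_score(candidate: Candidate) -> float:
    """Weighted sum of the four components."""
    return sum(weight * getattr(candidate, name) for name, weight in WEIGHTS.items())


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Best-fit agent first; the next entries are the reassignment fallbacks."""
    return sorted(candidates, key=assignment_score, reverse=True)
```

For example, a candidate with capability 1.0, trust 0.5, availability 1.0, and cost score 1.0 scores 0.35 + 0.15 + 0.20 + 0.15 = 0.85, which outranks one with capability 0.6, trust 0.9, availability 1.0, and cost score 0.5 (0.755).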

Demo — See It Run

The research pipeline demo shows the full delegation lifecycle: decomposition, assignment, verification failure, retry, and trust updates — all with mock agents (no API keys needed).

python examples/research_pipeline.py

==================================================
  delegato — Research Pipeline Demo
==================================================

[DECOMPOSE]  Breaking task into 3 sub-tasks...
[ASSIGN]  task → searcher
[EXECUTE]  searcher running...
[VERIFY]  regex: PASS (Regex matched: drug discovery|AI.+pharma|molecule)
[TRUST]  searcher.web_search: 0.50 → 0.55
[COMPLETE]  searcher done
[ASSIGN]  task → analyzer
[EXECUTE]  analyzer running...
[VERIFY]  regex: PASS (Regex matched: confidence)
[TRUST]  analyzer.data_analysis: 0.50 → 0.55
[COMPLETE]  analyzer done
[ASSIGN]  task → synthesizer
[EXECUTE]  synthesizer running...
[VERIFY]  llm_judge: FAIL (Only 2 examples found, need at least 3)
[TRUST]  synthesizer.summarization: 0.50 → 0.40
[VERIFY]  llm_judge: PASS (3 examples, 487 words, good quality)
[TRUST]  synthesizer.summarization: 0.40 → 0.46
[COMPLETE]  synthesizer done

==================================================
  RESULT: SUCCESS
  Total time: 0.0s | Cost: $0.025 | Reassignments: 0
==================================================

The synthesizer's first attempt fails verification (only 2 examples instead of the required 3). Trust drops from 0.50 to 0.40. On retry, it produces a passing summary — trust partially recovers to 0.46 but remains below baseline, reflecting the earlier failure.
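The trust values in the log are consistent with a simple asymmetric rule: a success moves trust 10% of the way toward 1.0, and a failure removes 20% of the current value. That rule and both rates are inferred from the demo output only; they are an assumption, not a documented part of `TrustTracker`.

```python
# Rates inferred from the demo log above; assumptions, not documented constants.
SUCCESS_RATE = 0.1  # fraction of the remaining headroom gained on success
FAILURE_RATE = 0.2  # fraction of the current trust lost on failure


def update_trust(trust: float, success: bool) -> float:
    """Asymmetric update: failures cost more than successes earn."""
    if success:
        return trust + SUCCESS_RATE * (1.0 - trust)
    return trust - FAILURE_RATE * trust


trust = 0.50
trust = update_trust(trust, success=False)  # first synthesizer attempt fails
print(f"{trust:.2f}")  # 0.40
trust = update_trust(trust, success=True)   # retry passes
print(f"{trust:.2f}")  # 0.46
```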

Core Concepts

  • Task — Atomic unit of work. Each task carries a goal, required capabilities, verification spec, priority, complexity, and reversibility level. Tasks form DAGs when decomposed.
  • Agent — Registered worker with declared capabilities. The handler is any async callable — agents from LangGraph, CrewAI, AutoGen, or plain functions all plug in the same way.
  • Delegator — Main orchestrator. Decomposes tasks, scores agents, manages execution, verifies outputs, and handles failures. All components are wired together here.
  • Verification — Contract-first output checking. Five methods: LLM_JUDGE (subjective quality), REGEX (pattern matching), SCHEMA (JSON validation), FUNCTION (custom logic), NONE. Multi-judge consensus runs N independent evaluations for high-stakes tasks.
  • Trust — Per-agent, per-capability scores. Start at 0.5, update asymmetrically (failures penalize more than successes reward), and decay toward 0.5 over time when idle.
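The idle decay toward 0.5 could take many forms. The sketch below assumes exponential decay with a half-life, which is one common choice; the function name, the `half_life_seconds` parameter, and its default are hypothetical and not taken from delegato.

```python
BASELINE = 0.5  # neutral starting trust, as described above


def decay_trust(trust: float, idle_seconds: float,
                half_life_seconds: float = 86_400.0) -> float:
    """Pull trust toward BASELINE; the gap halves every half_life_seconds.

    Exponential decay and the one-day default are assumptions for illustration.
    """
    if idle_seconds < 0:
        raise ValueError("idle_seconds must be non-negative")
    remaining_gap = 0.5 ** (idle_seconds / half_life_seconds)
    return BASELINE + (trust - BASELINE) * remaining_gap
```

Under this assumption, an agent at 0.9 that sits idle for one half-life drifts to 0.7, and an agent at 0.3 drifts up to 0.4, so both high and low scores lose weight when there is no recent evidence.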

Advanced Features

Multi-Judge Consensus

VerificationSpec(
    method=VerificationMethod.LLM_JUDGE,
    criteria="Report has 3+ examples with cited sources",
    judges=3,                    # 3 independent evaluations
    consensus_threshold=0.66,    # 2/3 must agree to pass
)
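One plausible reading of `consensus_threshold` is "the fraction of judges that must vote pass". The sketch below implements that reading under stated assumptions: each judge returns a score between 0 and 1 (as `mock_llm` does in the Quick Start), and a judge counts as a pass vote when its score reaches a hypothetical `pass_score` cutoff. The library's actual aggregation may differ.

```python
def consensus_passes(scores: list[float],
                     consensus_threshold: float = 0.66,
                     pass_score: float = 0.5) -> bool:
    """Pass when the share of judges voting 'pass' reaches the threshold.

    pass_score is a hypothetical per-judge cutoff, not a delegato parameter.
    """
    if not scores:
        raise ValueError("at least one judge score is required")
    pass_votes = sum(1 for score in scores if score >= pass_score)
    return pass_votes / len(scores) >= consensus_threshold


print(consensus_passes([0.9, 0.8, 0.2]))  # True: 2 of 3 judges pass
print(consensus_passes([0.9, 0.2, 0.1]))  # False: 1 of 3 judges pass
```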

Circuit Breakers

If an agent's trust drops by more than 0.3 in a single task, all active contracts for that agent are paused and a TRUST_CIRCUIT_BREAK event fires.
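The trigger condition can be sketched as a comparison of trust before and after a task. The function name is hypothetical, and the rounding step is an implementation detail of this sketch that keeps a drop of exactly 0.3 from tripping the breaker through floating-point error.

```python
MAX_DROP = 0.3  # threshold stated above


def should_circuit_break(trust_before: float, trust_after: float,
                         max_drop: float = MAX_DROP) -> bool:
    """True when trust fell by strictly more than max_drop within one task."""
    # Round so that e.g. 0.8 - 0.5 (0.30000000000000004) counts as exactly 0.3.
    drop = round(trust_before - trust_after, 9)
    return drop > max_drop
```

A caller would pause the agent's active contracts and emit `TRUST_CIRCUIT_BREAK` when this returns `True`. The demo's drop from 0.50 to 0.40 stays well under the threshold, so no breaker fires there.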

Complexity Floor

Tasks with complexity <= 2 and reversibility == HIGH bypass the full pipeline when a trusted agent (trust >= 0.7) is available, reducing overhead for trivial operations.
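Written as a predicate, the fast-path check looks roughly like this. The `Reversibility` enum here is a local stand-in (only the `HIGH` member is attested above; `LOW` is assumed), and `takes_fast_path` and `best_agent_trust` are hypothetical names.

```python
from enum import Enum


class Reversibility(Enum):
    """Local stand-in for delegato's enum; only HIGH is confirmed by the text."""
    LOW = "low"
    HIGH = "high"


def takes_fast_path(complexity: int, reversibility: Reversibility,
                    best_agent_trust: float) -> bool:
    """All three conditions must hold for a task to skip decomposition."""
    return (
        complexity <= 2
        and reversibility is Reversibility.HIGH
        and best_agent_trust >= 0.7
    )
```

Because new agents start at trust 0.5, this check would only admit agents that have already built up a track record.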

Event System

from delegato import DelegationEventType

async def on_completed(event):
    print(f"Task {event.task_id} completed by {event.agent_id}")

delegator.on(DelegationEventType.TASK_COMPLETED, on_completed)

All event types: TASK_DECOMPOSED, TASK_ASSIGNED, TASK_STARTED, TASK_COMPLETED, TASK_FAILED, VERIFICATION_PASSED, VERIFICATION_FAILED, TRUST_UPDATED, TRUST_CIRCUIT_BREAK, TASK_REASSIGNED, ESCALATED
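To show the callback pattern in isolation, here is a minimal async event bus with the same `on(event_type, handler)` shape as the example above. `MiniEventBus` and its `emit` method are hypothetical and stand in for the library's `EventBus`, whose internals are not described here; event types are plain strings in this sketch.

```python
import asyncio
from collections import defaultdict


class MiniEventBus:
    """Toy event bus: register async handlers, then fan events out to them."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type, handler):
        """Register an async callable for one event type."""
        self._handlers[event_type].append(handler)

    async def emit(self, event_type, event):
        """Await every handler registered for event_type, concurrently."""
        handlers = self._handlers.get(event_type, [])
        await asyncio.gather(*(handler(event) for handler in handlers))
```

Handlers registered for other event types are not called, so an audit logger can subscribe to everything while an alerting hook subscribes only to `TRUST_CIRCUIT_BREAK` or `ESCALATED`.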

API Reference

Category        Exports
Orchestrator    Delegator
Models          Agent, Task, TaskResult, DelegationResult, Contract, TaskDAG, VerificationSpec, Permission
Enums           VerificationMethod, TaskStatus, Reversibility, DelegationEventType
Engines         DecompositionEngine, VerificationEngine, CoordinationLoop
Infrastructure  EventBus, TrustTracker, AssignmentScorer, AuditLog, PermissionManager
LLM             complete, complete_json, LLMError

Testing

pip install -e ".[dev]"
pytest

306 tests, 94% coverage, all mock-based — no API keys or external services needed.

Paper Reference

Based on "Intelligent AI Delegation" (Tomasev et al., Google DeepMind, Feb 2026) — arXiv:2602.11865. A local copy is available in docs/.

License

MIT
