Binex

Debuggable runtime for AI agent pipelines
Orchestrate multi-agent workflows. Trace every step. Replay and diff runs.


Quickstart · Documentation · Report Bug · Request Feature



Why Binex?

Building multi-agent systems is hard. Debugging them is harder. Binex gives you:

  • YAML-first workflows — define agent pipelines as readable DAGs, not tangled code
  • Full execution tracing — every node call, every artifact, every millisecond recorded
  • Post-mortem debugging — inspect any run after the fact with rich, filterable reports
  • Replay with agent swap — re-run a workflow substituting different LLMs or agents
  • Run diffing — compare two executions side-by-side to spot regressions
  • Human-in-the-loop — approval gates and free-text input with conditional branching

(back to top)


Demo

A multi-provider research pipeline: Ollama runs locally for planning and summarization, OpenRouter calls cloud models for parallel research — all in one YAML file.

Requirements to run this demo
  • Ollama installed and running locally
  • Model pulled: ollama pull gemma3:4b
  • Free OpenRouter API key (set OPENROUTER_API_KEY in .env)
  • Binex installed: pip install -e .

# examples/multi-provider-demo.yaml
name: multi-provider-research

nodes:
  user_input:
    agent: "human://input"                          # ask the user for a topic

  planner:
    agent: "llm://ollama/gemma3:4b"                 # local LLM plans the research
    system_prompt: "Create a structured research plan with 3 subtopics..."
    inputs: { topic: "${user_input.result}" }
    depends_on: [user_input]

  researcher1:
    agent: "llm://openrouter/z-ai/glm-4.5-air:free"    # cloud model researches subtopic 1
    inputs: { plan: "${planner.result}" }
    depends_on: [planner]

  researcher2:
    agent: "llm://openrouter/stepfun/step-3.5-flash:free"  # cloud model researches subtopic 2
    inputs: { plan: "${planner.result}" }
    depends_on: [planner]

  summarizer:
    agent: "llm://ollama/gemma3:4b"                 # local LLM combines findings
    inputs: { research1: "${researcher1.result}", research2: "${researcher2.result}" }
    depends_on: [researcher1, researcher2]

The pipeline's DAG:

graph LR
    A["user_input<br/><sub>human://input</sub>"] --> B["planner<br/><sub>ollama/gemma3:4b</sub>"]
    B --> C["researcher1<br/><sub>openrouter/glm-4.5-air</sub>"]
    B --> D["researcher2<br/><sub>openrouter/step-3.5-flash</sub>"]
    C --> E["summarizer<br/><sub>ollama/gemma3:4b</sub>"]
    D --> E

Run it, explore results, debug the execution:

Binex Demo — multi-provider research pipeline

(back to top)


Quickstart

# Clone
git clone https://github.com/Alexli18/binex.git
cd binex

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate   # Linux/macOS
# .venv\Scripts\activate    # Windows

# Install
pip install -e .

# Run the zero-config demo
binex hello

# Run a workflow
binex run examples/simple.yaml --var input="hello world"

# Debug a completed run
binex debug <run-id>
binex debug latest          # shortcut for the most recent run

# Optional: rich colored output
pip install -e ".[rich]"
binex debug latest --rich

See it in action

$ binex hello

Running built-in hello-world workflow...

  [1/2] greeter ...
  [greeter] -> result:
Hello from Binex!

  [2/2] responder ...
  [responder] -> result:
{"greeter": "Hello from Binex!"}

Run completed (2/2 nodes)
Run ID: run_d71c9a50

Next steps:
  binex debug run_d71c9a50    — inspect the run
  binex init                  — create your own project
  binex run examples/simple.yaml — try a workflow file

$ binex run examples/simple.yaml --var input="hello world"

Run ID: run_69651bec
Workflow: simple-pipeline
Status: completed
Nodes: 2/2 completed
╭──────────────────────── consumer ────────────────────────╮
│ { "art_producer": { "msg": "hello world" } }             │
╰──────────────────────── result ──────────────────────────╯

(back to top)


Trace & Debug

Every run is fully recorded. Inspect the execution timeline and DAG:

binex trace <run-id>
binex trace

Compare two runs side-by-side — spot status changes, latency deltas, and output differences:

binex diff <run-a> <run-b>
binex diff
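Conceptually, a run diff is a node-by-node comparison of two execution records. A minimal sketch of the idea (illustrative only — the field names `status` and `duration_ms` are assumptions, not Binex's actual record schema):

```python
def diff_runs(run_a: dict, run_b: dict) -> dict:
    """Compare two runs node-by-node. Each run maps node name ->
    {"status": str, "duration_ms": float}. Returns per-node deltas."""
    deltas = {}
    for node in sorted(set(run_a) | set(run_b)):
        a, b = run_a.get(node), run_b.get(node)
        if a is None or b is None:
            # Node exists in only one of the two runs.
            deltas[node] = {"change": "added" if a is None else "removed"}
            continue
        entry = {}
        if a["status"] != b["status"]:
            entry["status"] = (a["status"], b["status"])
        entry["latency_delta_ms"] = b["duration_ms"] - a["duration_ms"]
        deltas[node] = entry
    return deltas

run_a = {"planner": {"status": "completed", "duration_ms": 120.0}}
run_b = {"planner": {"status": "failed", "duration_ms": 95.0},
         "extra": {"status": "completed", "duration_ms": 10.0}}
print(diff_runs(run_a, run_b))
```

The same walk works for output differences: compare each node's artifacts instead of its timing.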

Post-mortem debug of a failed run — see errors, prompts, and artifacts per node:

binex debug <run-id> --errors --rich
binex debug

(back to top)


How It Works

Define a workflow in YAML. Binex builds a DAG, schedules nodes respecting dependencies, dispatches each to the right agent adapter, and records everything.

name: research-pipeline
description: "Fan-out research with human approval"

nodes:
  planner:
    agent: "llm://openai/gpt-4"
    system_prompt: "Break this topic into 3 research questions"
    inputs:
      topic: "${user.topic}"
    outputs: [questions]

  researcher_1:
    agent: "llm://anthropic/claude-sonnet-4-20250514"
    inputs: { question: "${planner.questions}" }
    outputs: [findings]
    depends_on: [planner]

  researcher_2:
    agent: "a2a://localhost:8001"
    inputs: { question: "${planner.questions}" }
    outputs: [findings]
    depends_on: [planner]

  reviewer:
    agent: "human://approve"
    inputs:
      draft: "${researcher_1.findings}"
    outputs: [decision]
    depends_on: [researcher_1, researcher_2]

  summarizer:
    agent: "llm://openai/gpt-4"
    inputs:
      research: "${researcher_1.findings}"
    outputs: [summary]
    depends_on: [reviewer]
    when: "${reviewer.decision} == approved"

The resulting DAG:

graph TD
    A[planner] --> B[researcher_1]
    A --> C[researcher_2]
    B --> D["reviewer (human approval)"]
    C --> D
    D -->|approved| E[summarizer]
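Inputs reference upstream outputs with ${node.key} placeholders. Resolution can be sketched as one substitution pass over the results of completed nodes (an illustrative sketch, not Binex's actual resolver):

```python
import re

def resolve(template: str, results: dict) -> str:
    """Replace ${node.key} placeholders with values from completed nodes.
    `results` maps node name -> {output key -> value}."""
    def sub(match: re.Match) -> str:
        node, key = match.group(1), match.group(2)
        return str(results[node][key])
    return re.sub(r"\$\{(\w+)\.(\w+)\}", sub, template)

results = {"reviewer": {"decision": "approved"}}
print(resolve("${reviewer.decision} == approved", results))
# -> "approved == approved"
```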

(back to top)


Architecture

block-beta
    columns 3

    CLI["CLI\nrun · debug · trace · replay · diff · dev"]:3
    Runtime["Runtime\nOrchestrator + Dispatcher"]:3
    Adapters["Adapters\nlocal:// · llm:// · a2a:// · human://"] Graph["Graph\nDAG · topo-sort · cycle detect"] Spec["Workflow Spec\nYAML loader · validation"]
    Stores["Stores\nSQLite executions + FS artifacts"]:3
    Models["Models\nWorkflow · Node · Artifact · Execution"]:3
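The Graph layer's "topo-sort · cycle detect" responsibility maps naturally onto Kahn's algorithm. A minimal sketch using the demo workflow's depends_on edges (illustrative, not the actual implementation):

```python
from collections import deque

def topo_sort(deps: dict) -> list:
    """Kahn's algorithm. `deps` maps node -> list of nodes it depends on.
    Returns a valid execution order; raises if the graph has a cycle."""
    indegree = {n: len(d) for n, d in deps.items()}
    dependents = {n: [] for n in deps}
    for node, upstream in deps.items():
        for u in upstream:
            dependents[u].append(node)
    ready = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for dep in dependents[node]:
            indegree[dep] -= 1
            if indegree[dep] == 0:
                ready.append(dep)
    if len(order) != len(deps):
        raise ValueError("cycle detected in workflow DAG")
    return order

deps = {
    "user_input": [],
    "planner": ["user_input"],
    "researcher1": ["planner"],
    "researcher2": ["planner"],
    "summarizer": ["researcher1", "researcher2"],
}
print(topo_sort(deps))
```

Nodes that become ready at the same time (researcher1 and researcher2 here) have no ordering constraint between them, which is what lets the runtime fan them out in parallel.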

(back to top)


Features

Agent Adapters

Prefix           Adapter               Description
local://         LocalPythonAdapter    In-process Python callable
llm://           LLMAdapter            LLM completion via LiteLLM (40+ providers)
a2a://           A2AAgentAdapter       Remote agent via A2A protocol
human://input    HumanInputAdapter     Terminal prompt for free-text input
human://approve  HumanApprovalAdapter  Approval gate with conditional branching
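Dispatch by URI prefix can be sketched as a scheme lookup (a hypothetical illustration — `run_local`, `ADAPTERS`, and `dispatch` are invented names, not Binex's adapter interface, which lives in src/binex/adapters/):

```python
from urllib.parse import urlsplit

# Hypothetical handler: takes the target after the prefix plus node inputs.
def run_local(target: str, inputs: dict) -> dict:
    return {"result": f"ran {target} with {inputs}"}

# Hypothetical registry: scheme -> handler.
ADAPTERS = {"local": run_local}

def dispatch(agent_uri: str, inputs: dict) -> dict:
    """Route 'scheme://target' to the adapter registered for that scheme."""
    parts = urlsplit(agent_uri)
    handler = ADAPTERS.get(parts.scheme)
    if handler is None:
        raise ValueError(f"no adapter for prefix {parts.scheme}://")
    return handler(parts.netloc + parts.path, inputs)

print(dispatch("local://my_module.greet", {"name": "world"}))
```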

CLI Commands

Command                           Description
binex run <workflow.yaml>         Execute a workflow
binex debug <run-id|latest>       Post-mortem inspection (--json, --errors, --node, --rich)
binex trace <run-id>              Execution timeline, node details, or DAG graph
binex replay <run-id>             Re-run with optional agent swaps
binex diff <run1> <run2>          Compare two runs side-by-side
binex artifacts list <run-id>     List artifacts with lineage tracking
binex validate <workflow.yaml>    Validate YAML before execution
binex scaffold workflow "A -> B"  Generate a workflow from DSL shorthand
binex start                       Interactive wizard to create a workflow step-by-step
binex init                        Interactive project setup (workflow / agent / full)
binex dev up                      Start the Docker dev stack (Ollama + LiteLLM + Registry)
binex doctor                      Check system health
binex explore                     Interactive browser for runs and artifacts
binex hello                       Zero-config demo

DSL Shorthand

Generate workflows from simple expressions:

binex scaffold workflow "planner -> researcher, analyst -> summarizer"

Nine built-in patterns available: simple, diamond, fan-out, fan-in, map-reduce, and more.
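One plausible reading of the shorthand is stages separated by ->, with commas fanning out within a stage, each node depending on every node in the previous stage. A minimal parser sketch under that assumption (not Binex's actual grammar):

```python
def parse_shorthand(expr: str) -> dict:
    """Turn 'A -> B, C -> D' into a node -> depends_on mapping,
    where each node depends on all nodes of the previous stage."""
    stages = [[n.strip() for n in stage.split(",")]
              for stage in expr.split("->")]
    deps = {}
    prev = []
    for stage in stages:
        for node in stage:
            deps[node] = list(prev)
        prev = stage
    return deps

print(parse_shorthand("planner -> researcher, analyst -> summarizer"))
```

Under this reading, the example yields a fan-out from planner to researcher and analyst, then a fan-in to summarizer.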

LLM Providers

Out-of-the-box support for 9 providers via LiteLLM:

OpenAI · Anthropic · Google Gemini · Ollama · OpenRouter · Groq · Mistral · DeepSeek · Together AI

(back to top)


Project Structure

src/binex/
├── adapters/        # Agent execution backends (local, LLM, A2A, human)
├── agents/          # Built-in agent implementations
├── cli/             # Click CLI commands
├── graph/           # DAG construction + topological scheduling
├── models/          # Pydantic v2 domain models
├── registry/        # FastAPI agent registry service
├── runtime/         # Orchestrator, dispatcher, lifecycle
├── stores/          # SQLite execution + filesystem artifact persistence
├── trace/           # Debug reports, lineage, timeline, diffing
├── workflow_spec/   # YAML loader + validator + variable resolution
└── tools.py         # Tool calling support (@tool decorator)

(back to top)


Built With

Python FastAPI Pydantic SQLite LiteLLM Docker Click pytest

(back to top)


Examples

The examples/ directory contains 22 ready-to-run workflows:

Example                       What it demonstrates
hello-world.yaml              Minimal two-node pipeline
diamond.yaml                  Diamond dependency pattern
fan-out-fan-in.yaml           Parallel research with aggregation
human-in-the-loop.yaml        Approval gates and conditional branching
multi-provider-research.yaml  Multiple LLM providers in one workflow
a2a-multi-agent.yaml          Remote agents via A2A protocol
conditional-routing.yaml      Branching based on node output
map-reduce.yaml               MapReduce-style aggregation

(back to top)


Documentation

Full documentation is available at alexli18.github.io/binex.

(back to top)


Development

# Clone
git clone https://github.com/Alexli18/binex.git
cd binex

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests (870 tests, 96% coverage)
python -m pytest tests/

# Lint
ruff check src/

# Start dev environment (Ollama + LiteLLM + Registry)
binex dev up

(back to top)


Roadmap

See ROADMAP.md for the full roadmap. Highlights:

  • Web UI for execution visualization
  • Plugin system for custom adapters
  • Framework adapters (LangChain, CrewAI, AutoGen)
  • Workflow versioning and migration
  • Distributed execution across multiple runtimes
  • OpenTelemetry integration for observability

See the open issues for a full list of proposed features and known issues.

(back to top)


Contributing

Contributions are welcome! Here's how:

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/amazing-feature)
  3. Commit your Changes (git commit -m 'Add amazing feature')
  4. Push to the Branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

(back to top)


License

Distributed under the MIT License. See LICENSE for more information.

(back to top)


Built with focus on debuggability, because AI agents shouldn't be black boxes.
