
Forge — an open-source, enterprise-ready multi-agent orchestration platform with cost awareness, governance, and security built in.


Forge

The open-source multi-agent orchestration platform for teams that ship.

Build, run, optimize, and govern teams of AI agents for real business workflows — with cost-awareness, security, and compliance built in from line one.

License: MIT · Python 3.11+ · Typed · Status: Beta


Why Forge?

A single AI agent is a clever assistant. A coordinated team of agents is a force multiplier — it can plan, divide work, call tools, check itself, and deliver a finished outcome instead of a suggestion.

But most teams can't put multi-agent systems into production. The demos are impressive; the operational reality is not. The questions that kill adoption are always the same:

"What is this going to cost? Who can run it? What happens when a tool misbehaves? Can we prove what it did for the auditors? Will it leak our data?"

Forge is built to answer those questions out of the box. It is an orchestration core that treats cost, security, governance, and observability as first-class features — not afterthoughts you bolt on before launch.

import asyncio
from forge import Orchestrator

async def main() -> None:
    # Zero config. Runs fully offline with the deterministic echo provider, and
    # automatically uses Claude or GPT when an API key is present in the environment.
    async with Orchestrator() as forge:
        result = await forge.run(
            "Research our top 3 competitors, draft a comparison, and compute Q3 growth at 18%",
            mode="supervisor",   # spawns parallel workers, one per subtask
        )
        print(result.output)
        print(result.usage.format_table())   # tokens + cost, per model and per agent

asyncio.run(main())

What's shipped (v0.2.0)

An honest snapshot of what works today versus what is on the way. Everything marked Shipped is implemented, typed, and covered by the test suite.

Feature | Status
Multi-agent orchestration (supervisor + parallel workers) | Shipped
Intelligent model routing (cost_optimized, quality_first, balanced, fixed) | Shipped
Anthropic provider (Claude Haiku 4.5, Sonnet 4.6, Opus 4.8, Fable 5) | Shipped
OpenAI provider (gpt-4o-mini, gpt-4o, gpt-4.1, o3) | Shipped
Offline deterministic provider (zero config, no API key) | Shipped
Pre-flight + per-step budget caps | Shipped
Tool sandboxing (allowlist/denylist, timeouts, dangerous-denied-by-default) | Shipped
RBAC (admin / operator / developer / viewer) | Shipped
Prompt-injection heuristics + input sanitization | Shipped
SHA-256 hash-chained tamper-evident audit log | Shipped
PII redaction (emails, cards, SSNs, IPs, phones) | Shipped
Event bus (21 lifecycle event types) | Shipped
Streaming token output through the event bus (stream=True) | Shipped
Per-run cost reporting (tokens + USD, per model, per agent) | Shipped
Conversation memory + in-memory RAG vector store | Shipped
CLI (forge run, forge models, forge audit) | Shipped
47 tests, mypy strict, ruff clean, CI on 3.11 / 3.12 / 3.13 | Shipped
Durable memory backends (pgvector, SQLite-VSS) | Planned
OpenTelemetry export for traces and metrics | Planned
Ollama / Bedrock / Vertex providers | Planned
Policy-as-code for tool governance | Planned
Hosted SaaS control plane (TypeScript / Next.js) | Future

The force-multiplier thesis

Without Forge | With Forge
One prompt, one answer, no division of labor | A supervisor decomposes the goal and delegates to specialized workers
Workers run one at a time | Workers run in parallel, with a pre-flight budget check before any of them start
Every call hits your most expensive model | Intelligent routing sends easy work to cheap models, hard work to frontier models
Cost is a surprise on the invoice | Cost is tracked per run, per agent, per model, with hard budget caps
Tools run with full trust | Tools run sandboxed, with side-effecting tools denied by default
"Trust me, it worked" | A tamper-evident, hash-chained audit trail of every action
Prototype you can't ship | A typed, tested core designed for production

Key features

Multi-agent orchestration

  • Supervisor + dynamic workers. A supervisor breaks a goal into independent subtasks and spawns a focused worker agent for each, then synthesizes the result.
  • True parallelism. Workers run concurrently via asyncio.gather in bounded batches, so I/O-bound model and tool calls overlap on a single event loop. A configurable max_workers cap (default 5) keeps fan-out under control, and the supervisor is not stuck waiting on one worker while others in the same batch could be progressing.
  • Graceful failure isolation. A crashing worker emits WORKER_FAILED, records the error to the audit log, and returns a partial result — its peers are never cancelled, and the run completes with everything that succeeded.
  • Real agentic loop. Workers reason, call tools, observe results, and iterate until done — with a hard step budget so they never spin forever.

Intelligent routing & cost optimization

  • Capability/price-aware router with cost_optimized, quality_first, balanced, and fixed strategies.
  • One pricing source of truth. The model registry knows real list pricing across all providers (Claude Haiku 4.5, Sonnet 4.6, Opus 4.8, Fable 5; GPT-4o-mini, GPT-4o, GPT-4.1, o3) and computes spend consistently.
  • Budgets that bite — twice. A pessimistic pre-flight check estimates the worst-case spend of a worker batch and refuses to start it if that would blow the remaining budget. A precise, real-time check then fires after every model call. Together they bound both over-spend and wasted API calls.
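To make the two budget checks concrete, here is a rough sketch of a pessimistic pre-flight estimate followed by a precise post-call check. The Budget class, its method names, and the prices used are assumptions for illustration; they are not Forge's API or real list prices.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a run would exceed, or has exceeded, its USD cap."""


@dataclass
class Budget:
    max_usd: float
    spent_usd: float = 0.0

    @property
    def remaining_usd(self) -> float:
        return self.max_usd - self.spent_usd

    def preflight(self, workers: int, max_steps: int, worst_usd_per_call: float) -> float:
        """Pessimistic check: assume every worker uses every step at the worst price."""
        worst_case = workers * max_steps * worst_usd_per_call
        if worst_case > self.remaining_usd:
            raise BudgetExceeded(
                f"worst case ${worst_case:.4f} exceeds remaining ${self.remaining_usd:.4f}"
            )
        return worst_case

    def record_call(
        self,
        tokens_in: int,
        tokens_out: int,
        usd_per_mtok_in: float,
        usd_per_mtok_out: float,
    ) -> float:
        """Precise check: charge the actual tokens, then enforce the cap."""
        cost = (tokens_in * usd_per_mtok_in + tokens_out * usd_per_mtok_out) / 1_000_000
        self.spent_usd += cost
        if self.spent_usd > self.max_usd:
            raise BudgetExceeded(f"spent ${self.spent_usd:.4f} of ${self.max_usd:.4f}")
        return cost


budget = Budget(max_usd=0.25)
estimate = budget.preflight(workers=3, max_steps=4, worst_usd_per_call=0.01)
cost = budget.record_call(1000, 500, usd_per_mtok_in=1.0, usd_per_mtok_out=5.0)

try:
    # 3 workers x 12 steps x $0.01 is more than what is left, so the batch never starts.
    budget.preflight(workers=3, max_steps=12, worst_usd_per_call=0.01)
    blocked = False
except BudgetExceeded:
    blocked = True
```

The pre-flight check is deliberately conservative: it can refuse a batch that would have fit, in exchange for never starting one that cannot.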

Tools, memory & RAG — extensible by design

  • @tool decorator turns any typed Python function into an agent tool, with JSON-Schema generated automatically from your type hints and docstring.
  • Pluggable memory. Short-term conversation memory plus a dependency-free in-memory vector store for RAG — swap in any backend behind one tiny interface.
  • Provider-agnostic core. Anthropic (Claude) and OpenAI (GPT / o-series) ship in the box alongside a deterministic offline echo provider; add any provider by implementing one method.
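For readers curious how a decorator can derive a JSON Schema from type hints and a docstring, the idea can be sketched as follows. tool_schema and the type mapping are hypothetical and far simpler than a production version would be; Forge's @tool may work differently.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES: dict[Any, str] = {
    int: "integer",
    float: "number",
    str: "string",
    bool: "boolean",
}


def tool_schema(func: Callable[..., Any]) -> dict[str, Any]:
    """Build a JSON-Schema-style description from a function's signature."""
    hints = get_type_hints(func)
    properties: dict[str, dict[str, str]] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def fx_convert(amount: float, rate: float) -> float:
    """Convert an amount using an FX rate."""
    return round(amount * rate, 2)


schema = tool_schema(fx_convert)
print(schema)
```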

Security from the start

  • Tool sandboxing with allowlists/denylists, per-tool timeouts, and dangerous (network/filesystem) tools denied unless explicitly allowed.
  • Prompt-injection heuristics and input normalization on untrusted goals.
  • RBAC — map your IdP groups onto roles (admin, operator, developer, viewer) and gate who can run agents or use dangerous tools.
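The sandboxing rules above (dangerous tools denied unless allowlisted, every call bounded by a timeout) can be illustrated with a small asyncio sketch. MiniSandbox, ToolDenied, and the example tools are hypothetical stand-ins, not Forge's ToolSandbox; http_get here makes no network request.

```python
import asyncio
from typing import Any, Awaitable, Callable

AsyncTool = Callable[..., Awaitable[Any]]


class ToolDenied(Exception):
    """Raised when a dangerous tool is called without being allowlisted."""


class MiniSandbox:
    def __init__(self, allow: set[str], timeout_seconds: float = 30.0) -> None:
        self.allow = allow
        self.timeout_seconds = timeout_seconds
        self._tools: dict[str, tuple[AsyncTool, bool]] = {}

    def register(self, name: str, func: AsyncTool, *, dangerous: bool = False) -> None:
        self._tools[name] = (func, dangerous)

    async def call(self, name: str, **kwargs: Any) -> Any:
        func, dangerous = self._tools[name]
        # Dangerous tools are denied unless explicitly allowlisted.
        if dangerous and name not in self.allow:
            raise ToolDenied(name)
        # Every call is bounded by the sandbox timeout.
        return await asyncio.wait_for(func(**kwargs), timeout=self.timeout_seconds)


async def add(a: int, b: int) -> int:
    return a + b


async def http_get(url: str) -> str:
    return f"(pretend response from {url})"  # placeholder, no real network call


async def slow() -> str:
    await asyncio.sleep(1.0)
    return "finished"


async def demo() -> list[str]:
    sandbox = MiniSandbox(allow=set(), timeout_seconds=0.05)
    sandbox.register("add", add)
    sandbox.register("http_get", http_get, dangerous=True)
    sandbox.register("slow", slow)

    log = [str(await sandbox.call("add", a=2, b=3))]
    try:
        await sandbox.call("http_get", url="https://example.com")
    except ToolDenied:
        log.append("denied")
    try:
        await sandbox.call("slow")
    except asyncio.TimeoutError:
        log.append("timeout")
    return log


log = asyncio.run(demo())
print(log)
```

A timeout like this bounds how long the caller waits; it does not isolate the tool from the host process.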

Compliance & governance

  • Tamper-evident audit log. Every model call, tool call, plan, and decision is written to an append-only, SHA-256 hash-chained JSONL trail. Edits break the chain — and forge audit detects it.
  • PII redaction of logs and audit records (emails, cards, SSNs, IPs, phones).
  • Data-residency & retention hints recorded on every entry, to support GDPR and SOC 2 evidence gathering.
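The reason edits "break the chain" is easiest to see in code. The following is a minimal sketch of a SHA-256 hash chain, assuming each entry's hash covers the previous entry's hash plus the canonical JSON of the event. The entry layout, the all-zero genesis value, and the function names are assumptions; Forge's on-disk JSONL format may differ.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # sort_keys gives a canonical serialization so the hash is reproducible.
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], event: dict) -> dict:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(entry)
    return entry


def verify(chain: list[dict]) -> bool:
    prev_hash = GENESIS
    for entry in chain:
        # Each entry must point at its predecessor and match its own recomputed hash.
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


audit: list[dict] = []
append_entry(audit, {"type": "MODEL_CALL", "agent": "worker-1"})
append_entry(audit, {"type": "TOOL_CALL", "agent": "worker-1", "tool": "calculator"})
append_entry(audit, {"type": "RUN_FINISHED", "agent": "supervisor"})

ok_before = verify(audit)
audit[1]["event"]["tool"] = "shell"  # simulate someone editing history
ok_after = verify(audit)
print(ok_before, ok_after)
```

A chain like this detects edits to existing entries; detecting truncation of the tail additionally requires storing the latest hash somewhere the attacker cannot reach.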

Observability built in

  • A structured event bus emits 21 distinct lifecycle event types — run, agent, and worker start/finish (including WORKER_STARTED and WORKER_FAILED for parallel execution), model routing and calls, tool calls, budget thresholds, and security violations — so any subscriber (console, audit, metrics) sees the same stream.
  • Token streaming. Run with stream=True and the platform emits TOKEN_STREAM_START / TOKEN_CHUNK / TOKEN_STREAM_END events — each tagged with the agent that produced it, so you can render live output even across parallel workers. forge run "..." --stream gives the classic live-typing terminal feel.
  • Structured logging (human or JSON) and a per-run usage/cost report broken down per model and per agent.
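As a rough picture of how agent-tagged token events let a subscriber render parallel streams, consider the sketch below. The event type names come from the list above; the Event and EventBus classes and the payload shape are assumptions, not Forge's actual types.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Event:
    type: str
    agent: str
    data: dict[str, Any] = field(default_factory=dict)


class EventBus:
    def __init__(self) -> None:
        self._subscribers: list[Callable[[Event], None]] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self._subscribers.append(handler)

    def emit(self, event: Event) -> None:
        # Every subscriber sees the same stream of events.
        for handler in self._subscribers:
            handler(event)


async def stream_tokens(bus: EventBus, agent: str, tokens: list[str]) -> None:
    bus.emit(Event("TOKEN_STREAM_START", agent))
    for token in tokens:
        await asyncio.sleep(0)  # yield control so parallel workers interleave
        bus.emit(Event("TOKEN_CHUNK", agent, {"text": token}))
    bus.emit(Event("TOKEN_STREAM_END", agent))


buffers: defaultdict[str, str] = defaultdict(str)
seen: list[str] = []


def collect(event: Event) -> None:
    seen.append(event.type)
    if event.type == "TOKEN_CHUNK":
        # The agent tag keeps interleaved chunks from different workers apart.
        buffers[event.agent] += event.data["text"]


async def demo() -> None:
    bus = EventBus()
    bus.subscribe(collect)
    await asyncio.gather(
        stream_tokens(bus, "worker-1", ["Hello", " world"]),
        stream_tokens(bus, "worker-2", ["Forge", " streams"]),
    )


asyncio.run(demo())
print(dict(buffers))
```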

Install

pip install agentforge-oss                       # core (works offline, zero config)
pip install "agentforge-oss[anthropic]"          # + Claude provider
pip install "agentforge-oss[openai]"             # + OpenAI / GPT provider
pip install "agentforge-oss[anthropic,openai]"   # both real providers
pip install "agentforge-oss[all,dev]"            # everything + test/lint tooling

Note: The PyPI package is named agentforge-oss, so you install it with pip install agentforge-oss, but the import name is forge (import forge). This is the same convention as pip install Pillow followed by import PIL.

Forge runs fully offline out of the box using a deterministic echo provider, so you can explore the whole platform — routing, tools, supervision, audit — without an API key. Set ANTHROPIC_API_KEY to route automatically to Claude, or OPENAI_API_KEY to route to GPT. Both keys can be set at once; Forge prefers Anthropic by default (configurable).


Quickstart (CLI)

# Works with no API key — uses the offline provider.
forge run "Plan a product launch and calculate 15% of 3,400" --verbose

# Stream model output token-by-token as it is generated.
forge run "Write a short product tagline" --stream

# See the model registry and pricing the router reasons over.
forge models

# Verify the audit log hasn't been tampered with.
forge audit

--verbose streams a live trace so you can watch the supervisor plan, route each call, run tools in the sandbox, and tally cost in real time.


Quickstart (library)

A custom tool + a single agent

import asyncio
from forge import Orchestrator, ToolRegistry, tool, calculator

@tool
def fx_convert(amount: float, rate: float) -> float:
    """Convert an amount using an FX rate.

    Args:
        amount: The amount in the source currency.
        rate: The exchange rate to apply.
    """
    return round(amount * rate, 2)

async def main() -> None:
    tools = ToolRegistry([calculator, fx_convert])
    async with Orchestrator() as forge:
        result = await forge.run(
            "Convert 250 USD to EUR at 0.92 and then add a 3 EUR fee",
            mode="single",
            tools=tools,
        )
        print(result.output)

asyncio.run(main())

Parallel supervisor — workers run concurrently, one per subtask

import asyncio
from forge import Orchestrator, ForgeConfig, BudgetConfig

async def main() -> None:
    config = ForgeConfig(budget=BudgetConfig(max_workers=3, max_usd_per_run=0.25))
    async with Orchestrator(config) as forge:
        result = await forge.run(
            "Summarize our product, draft a pricing page, and write an FAQ",
            mode="supervisor",
        )
        print(result.output)
        print(result.usage.format_table())

asyncio.run(main())

Retrieval-augmented generation (RAG), offline

import asyncio
from forge import InMemoryVectorStore

async def main() -> None:
    store = InMemoryVectorStore()
    await store.add("Forge routes cheap tasks to small models to save cost.")
    await store.add("The audit log is hash-chained and tamper-evident.")
    hits = await store.search("how does Forge keep costs down?", k=1)
    print(hits[0].text)   # -> the cost-routing fact

asyncio.run(main())
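If you are wondering how a dependency-free vector store can rank documents at all, one simple approach is bag-of-words vectors compared by cosine similarity. The sketch below uses that approach purely as an illustration; TinyVectorStore is a hypothetical synchronous class, and the embedding InMemoryVectorStore actually uses is not specified here.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter[str]:
    """Bag-of-words 'embedding': lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter[str], b: Counter[str]) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyVectorStore:
    def __init__(self) -> None:
        self._docs: list[tuple[str, Counter[str]]] = []

    def add(self, text: str) -> None:
        self._docs.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        # Rank stored documents by similarity to the query, best first.
        ranked = sorted(self._docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = TinyVectorStore()
store.add("Forge routes cheap tasks to small models to save cost.")
store.add("The audit log is hash-chained and tamper-evident.")
hits = store.search("how does Forge keep costs down?", k=1)
print(hits[0])
```

Word overlap is a crude signal (here only "forge" matches, since "costs" and "cost" are different tokens), which is one reason real deployments swap in a learned embedding model.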

See examples/ for runnable end-to-end scripts, including enterprise governance (RBAC + budgets + audit verification).


Architecture

                         ┌──────────────────────────────────────────────┐
                         │                Orchestrator                  │
                         │  access control · sanitization · accounting  │
                         └───────────────┬──────────────────────────────┘
                                         │  RunContext (per run)
              ┌──────────────────────────┼───────────────────────────┐
              ▼                          ▼                           ▼
      ┌───────────────┐         ┌─────────────────┐          ┌──────────────┐
      │  Supervisor   │ spawns  │   Worker Agents │  call    │ Tool Sandbox │
      │  plan→delegate│────────▶│  reason ↔ act   │─────────▶│ allow/deny + │
      │  →synthesize  │ (gather)│  (parallel loop)│          │  timeouts    │
      └───────┬───────┘         └────────┬────────┘          └──────────────┘
              │                          │
              │      ┌───────────────────┴───────────────┐
              ▼      ▼                                    ▼
       ┌────────────────┐                        ┌───────────────────────────┐
       │  Model Router  │  picks model by        │  Model Providers          │
       │ cost / quality │  strategy + budget ───▶│ Anthropic · OpenAI · Echo │
       └───────┬────────┘                        └───────────────────────────┘
               │ pricing
               ▼
       ┌────────────────┐   cross-cutting, on every step:
       │ Model Registry │   Usage/Cost · Event Bus · Audit Log · Redaction
       └────────────────┘

Every layer is swappable:

Layer | Default | Swap in…
Provider | Echo (offline), Anthropic, OpenAI | Any ModelProvider (local, Bedrock, Vertex, …)
Routing | balanced strategy | Your own strategy / fixed model
Memory | In-memory vector store | Any Memory backend (pgvector, Pinecone, …)
Tools | calculator, utc_now | Any @tool function
Audit | Hash-chained JSONL | Forward events to your SIEM via the event bus

Enterprise use cases

Forge is a force multiplier wherever a workflow is multi-step, judgment-heavy, and needs a paper trail:

  • Revenue & RevOps — enrich a lead, draft tailored outreach, compute discounts within policy, and log every step for the deal record.
  • Customer support — triage a ticket, retrieve the right KB articles (RAG), draft a grounded reply, and escalate by policy.
  • Finance & operations — reconcile figures with a sandboxed calculator, summarize variances, and produce an auditable trail for controllers.
  • Compliance & risk — run document review where every action is recorded in a tamper-evident log, with PII redacted and access role-gated.
  • Engineering — fan out research across a codebase with a supervisor, keeping cheap models on grunt work and frontier models on the hard reasoning.

The common thread: Forge lets you say yes to production because the governance questions already have answers.


Security & compliance posture

  • Least privilege by default. Dangerous tools (network egress, filesystem) are denied unless you explicitly allowlist them.
  • Defense in depth. Input sanitization + prompt-injection heuristics sit in front of every untrusted goal; tool execution is bounded by timeouts.
  • Provable history. The audit log is append-only and hash-chained; forge audit (or Orchestrator.verify_audit()) detects any tampering.
  • Privacy aware. PII redaction runs before anything is written to logs or audit.
  • Access control. RBAC gates sensitive operations; map roles to your IdP groups.
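As an illustration of redacting PII before anything is written, a regex-based pass might look like the sketch below. The patterns cover only three of the listed categories and are intentionally simple; they are assumptions for this example, not Forge's redaction rules, and simple regexes will miss or over-match some real-world inputs.

```python
import re

# Deliberately simple patterns; production redaction needs broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a labeled placeholder before logging."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text


redacted = redact("Contact jane@example.com from 10.0.0.7, SSN 123-45-6789.")
print(redacted)
```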

Forge gives you strong application-level controls. It is not a substitute for OS-level isolation when executing untrusted code; for that, run tools in a container or microVM behind the same ToolSandbox interface. The design makes that a drop-in change.


Configuration

Configuration layers (lowest to highest priority): defaults → forge.toml → environment.

# forge.toml
[routing]
strategy = "balanced"          # cost_optimized | quality_first | balanced | fixed
default_model = "claude-opus-4-8"

[budget]
max_usd_per_run = 0.50
max_steps_per_agent = 12
max_workers = 6

[security]
detect_prompt_injection = true
tool_timeout_seconds = 30
# allow_tools = ["calculator", "http_get"]   # uncomment to permit a dangerous tool

[compliance]
audit_enabled = true
redact_pii = true
data_region = "eu-west-1"

Common environment variables: ANTHROPIC_API_KEY, OPENAI_API_KEY, FORGE_DEFAULT_MODEL, FORGE_ROUTING_STRATEGY, FORGE_MAX_USD_PER_RUN, FORGE_LOG_LEVEL, FORGE_JSON_LOGS, FORGE_AUDIT_ENABLED, FORGE_REDACT_PII. See .env.example.


Development

git clone https://github.com/sekacorn/AgentForge.git
cd AgentForge
python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -e ".[all,dev]"

pytest            # run the test suite (offline, no API key needed)
ruff check .      # lint
mypy forge        # strict type-check

The entire 47-test suite runs offline against the deterministic provider — fast, hermetic, and free. CI runs the same checks (ruff, ruff format, mypy strict, pytest) on Python 3.11, 3.12, and 3.13.


Contributing

Contributions are very welcome — see CONTRIBUTING.md. Please keep the core typed (mypy --strict) and tested.

Good first contributions

  • New model provider (Ollama, Bedrock, Vertex) — implement one method.
  • New built-in tool (web search, file read, database query).
  • Durable memory backend (SQLite-VSS, pgvector, Redis).
  • Routing strategy (a custom cost/quality tradeoff).
  • Example workflows that show Forge solving a real business problem.

License

MIT — free for commercial and private use. Build something that multiplies your team.
