agentloom

Production-ready agentic workflow orchestrator with native observability, resilience, and cost control.

Maintainers

cchinchilla-dev

Project description

Deterministic LLM workflow orchestration with native observability, resilience, and cost control.

Why AgentLoom?
Quick Start
Architecture
Workflow Definition (YAML)
Python DSL
Observability
Deploy
Why not autonomous agents?
Development
Contributing
License

Why AgentLoom?

Existing frameworks (LangGraph, CrewAI, AutoGen) treat observability and resilience as afterthoughts. AgentLoom is built from the ground up for production: circuit breakers, rate limiting, cost tracking, and OpenTelemetry traces are part of the core design — not plugins.

Feature	LangGraph	CrewAI	AutoGen	AgentLoom
Workflow definition	Python API	Decorators	Agent chat	YAML + Python DSL
Observability	LangSmith ($)	Minimal	Minimal	OTel + Prometheus + Grafana
Circuit breaker	No	No	No	Built-in
Cost tracking	No	No	No	Native with budgets
Multi-provider fallback	Manual	No	No	Automatic
Dependencies	Heavy	Medium	Medium	Minimal
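
To make the "Circuit breaker" row concrete, here is a minimal sketch of the pattern in plain Python. It is not AgentLoom's implementation; the class name, parameters, and the injectable clock are assumptions chosen for illustration.

```python
import time


class CircuitBreaker:
    """Open after `max_failures` consecutive errors; retry after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock          # injectable so the cool-down can be tested
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0           # any success closes the circuit fully
        return result
```

While the circuit is open, calls fail immediately instead of waiting on a provider that is already known to be unhealthy, which is what keeps latency and cost predictable during an outage.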

Quick Start

# Install
pip install agentloom

# Install with observability (OTel + Prometheus); the quotes stop shells such as zsh from globbing the brackets
pip install "agentloom[all]"

# Run a workflow
export OPENAI_API_KEY=sk-...
agentloom run examples/01_simple_qa.yaml

# Or with Ollama (free, local)
agentloom run examples/01_simple_qa.yaml --provider ollama --model phi4

# Validate a workflow
agentloom validate examples/03_router_workflow.yaml

# Visualize the DAG
agentloom visualize examples/03_router_workflow.yaml

Architecture

+-----------------------------------------------------+
|                   CLI / Python API                  |
+-----------------------------------------------------+
|                   Workflow Engine                   |
|  +-----------+  +-----------+  +---------------+    |
|  |DAG Parser |  | Scheduler |  | State Manager |    |
|  |& Validator|  |  (anyio)  |  |  (Pydantic)   |    |
|  +-----------+  +-----------+  +---------------+    |
+-----------------------------------------------------+
|                   Step Executors                    |
|  +--------+ +---------+ +------+ +------------+     |
|  |LLM Call| |Tool Exec| |Router| | Subworkflow|     |
|  +--------+ +---------+ +------+ +------------+     |
+-----------------------------------------------------+
|                  Provider Gateway                   |
|  +-----------------------------------------------+  |
|  | OpenAI | Anthropic | Google | Ollama          |  |
|  | + Fallback | Circuit Breaker | Rate Limiter   |  |
|  +-----------------------------------------------+  |
+-----------------------------------------------------+
|              Observability (optional)               |
|  +------------+  +----------+  +----------+         |
|  | OTel Traces|  |Prometheus|  | JSON Logs|         |
|  +------------+  +----------+  +----------+         |
+-----------------------------------------------------+

Workflow Definition (YAML)

name: classify-and-respond
config:
  provider: openai
  model: gpt-4o-mini
  budget_usd: 0.50

state:
  user_input: ""

steps:
  - id: classify
    type: llm_call
    system_prompt: "Classify as: question, complaint, or request."
    prompt: "Classify: {state.user_input}"
    output: classification

  - id: route
    type: router
    depends_on: [classify]
    conditions:
      - expression: "state.classification == 'question'"
        target: answer
    default: general_response

  - id: answer
    type: llm_call
    depends_on: [route]
    prompt: "Answer: {state.user_input}"
    output: response

  - id: general_response
    type: llm_call
    depends_on: [route]
    prompt: "Help with: {state.user_input}"
    output: response
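
The workflow above implies two mechanics: steps run in dependency order, and the router picks a target from explicit conditions. The sketch below shows one way both could work using only the standard library. It is an illustration under assumptions, not AgentLoom's engine: the function names are hypothetical, and the evaluator deliberately supports only the `state.<name> == <literal>` form used in the example.

```python
import ast
from graphlib import TopologicalSorter
from types import SimpleNamespace

# Step ids and depends_on edges copied from the YAML above.
DEPENDS_ON = {
    "classify": [],
    "route": ["classify"],
    "answer": ["route"],
    "general_response": ["route"],
}


def execution_order(depends_on):
    """Return one valid run order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(depends_on).static_order())


def condition_holds(expression, state):
    """Evaluate `state.<name> == <literal>` without calling eval()."""
    node = ast.parse(expression, mode="eval").body
    if not (
        isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and isinstance(node.ops[0], ast.Eq)
        and isinstance(node.left, ast.Attribute)
        and isinstance(node.left.value, ast.Name)
        and node.left.value.id == "state"
        and isinstance(node.comparators[0], ast.Constant)
    ):
        raise ValueError(f"unsupported condition: {expression!r}")
    return getattr(state, node.left.attr) == node.comparators[0].value


def pick_target(conditions, default, state):
    """First matching condition wins; otherwise fall through to default."""
    for cond in conditions:
        if condition_holds(cond["expression"], state):
            return cond["target"]
    return default
```

With `state = SimpleNamespace(classification="question")`, `pick_target` selects `answer`; any other classification falls through to `general_response`. Parsing the expression with `ast` rather than `eval()` keeps the router limited to comparisons the workflow author declared.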

Python DSL

from agentloom.core.dsl import workflow

wf = (
    workflow("my-workflow", provider="ollama", model="phi4")
    .set_state(question="What is Python?")
    .add_llm_step("answer", prompt="Answer: {question}", output="answer")
    .build()
)
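
Note that the YAML example writes placeholders as `{state.user_input}` while the DSL example writes `{question}`. The sketch below shows how each form could be filled from a state dictionary with `str.format`. Both helper names are hypothetical, and this does not claim to describe how AgentLoom itself resolves placeholders.

```python
from types import SimpleNamespace


def render_yaml_style(template, state):
    """Fill `{state.key}` placeholders, as written in the YAML example."""
    return template.format(state=SimpleNamespace(**state))


def render_dsl_style(template, state):
    """Fill bare `{key}` placeholders, as written in the DSL example."""
    return template.format(**state)
```

A missing key raises `AttributeError` in the first form and `KeyError` in the second, so a typo in a prompt fails loudly instead of sending a half-filled prompt to the model.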

Observability

With the observability extras installed (the agentloom[all] install shown in Quick Start), every workflow step emits OpenTelemetry traces and Prometheus metrics out of the box. No external SaaS is required: the full stack runs alongside your workloads.

# Start Prometheus + Grafana + Jaeger
cd deploy && docker compose up -d

# Access:
#   Grafana:    http://localhost:3000
#   Prometheus: http://localhost:9090
#   Jaeger:     http://localhost:16686

See Dashboard Documentation for panel descriptions, metrics reference, and troubleshooting.
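
As a rough picture of what per-step instrumentation records, the sketch below times a step and emits one JSON log line for it. It is a standard-library stand-in for a real OpenTelemetry span, not AgentLoom code; `step_span`, the `sink` parameter, and the field names are assumptions.

```python
import json
import time
from contextlib import contextmanager


@contextmanager
def step_span(step_id, sink=print):
    """Time one step and emit a JSON record, whether it succeeds or fails."""
    record = {"step": step_id, "status": "ok"}
    start = time.perf_counter()
    try:
        yield record  # the step body may attach fields such as cost_usd
    except Exception as exc:
        record["status"] = "error"
        record["error"] = type(exc).__name__
        raise
    finally:
        record["duration_s"] = round(time.perf_counter() - start, 6)
        sink(json.dumps(record))
```

Wrapping a step as `with step_span("classify") as rec:` and setting `rec["cost_usd"]` inside the block yields one record per step with its duration, status, and cost, which is the same information the traces and metrics expose.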

Deploy

AgentLoom is designed to run anywhere, from a single Docker container on your laptop to a fully orchestrated Kubernetes cluster with GitOps and observability. The container image runs as a non-root user with a read-only filesystem, and the Kubernetes deployments add Pod Security Standards enforcement and network policies (the dev overlay omits the NetworkPolicy for fast iteration).

The CLI processes a workflow and exits. There is no long-running server, no HTTP API, and no persistent connections. This makes Kubernetes Jobs (not Deployments) the correct primitive: finite execution, automatic retries, scheduled runs via CronJobs, and clean resource isolation per workflow.
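
To illustrate why a Job fits this run-and-exit model, the sketch below builds a minimal `batch/v1` Job manifest as a Python dictionary. It is not one of the manifests shipped in the repository: the image tag is a placeholder, passing `run <path>` as container args assumes the image's entrypoint is the `agentloom` CLI (as the Docker example suggests), and mounting the workflow file is left out for brevity.

```python
import json


def workflow_job(name, image, workflow_path, deadline_s=600, retries=2):
    """Build a batch/v1 Job manifest for one finite workflow run."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": retries,              # automatic retries
            "activeDeadlineSeconds": deadline_s,  # hard timeout for the run
            "template": {
                "spec": {
                    "restartPolicy": "Never",     # Jobs allow Never or OnFailure
                    "containers": [{
                        "name": "agentloom",
                        "image": image,           # placeholder image reference
                        "args": ["run", workflow_path],
                        "securityContext": {
                            "runAsNonRoot": True,
                            "readOnlyRootFilesystem": True,
                        },
                    }],
                }
            },
        },
    }


manifest = workflow_job(
    "agentloom-workflow", "agentloom:0.2.0", "/workflows/01_simple_qa.yaml"
)
manifest_json = json.dumps(manifest, indent=2)
```

Because kubectl accepts JSON as well as YAML, `manifest_json` could be piped to `kubectl apply -f -`. The same pod template placed under a CronJob's `jobTemplate` gives scheduled runs.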

Docker

The fastest way to run a workflow. The multi-stage Dockerfile produces a minimal image (~120MB) with a non-root user and read-only filesystem.

docker build -t agentloom .
docker run --rm -e OPENAI_API_KEY=sk-... \
  -v "$(pwd)/examples:/workflows:ro" \
  agentloom run /workflows/01_simple_qa.yaml

Kubernetes (Kustomize)

Plain YAML manifests organized with Kustomize overlays. Three environments are provided, each with progressively stricter security and resource controls:

- dev: minimal resources, no NetworkPolicy, latest tag for fast iteration.
- staging: moderate resources, NetworkPolicy enabled, CI image tag.
- production: strict NetworkPolicy (no Ollama egress), activeDeadlineSeconds hard timeout, pinned image version.

kubectl apply -k deploy/k8s/overlays/dev
kubectl logs job/agentloom-workflow -n agentloom

Helm

The recommended method for teams that need parameterized deployments. The chart packages all Kubernetes resources with built-in input validation — deploying without a workflow definition fails at render time, not at runtime.

helm install agentloom deploy/helm/agentloom \
  -n agentloom --create-namespace \
  --set-file workflow.definition=examples/01_simple_qa.yaml \
  --set provider.existingSecret=my-secret

Supports Job and CronJob modes, configurable NetworkPolicies, ResourceQuotas, and optional namespace creation with PSS labels.

Terraform

Provisions a complete local development environment in one command: a kind cluster with agentloom, plus the full observability stack (OTel Collector, Prometheus, Grafana, Jaeger). Set enable_observability = false for a lightweight setup without metrics and traces.

cd deploy/terraform
cp terraform.tfvars.example terraform.tfvars
terraform init && terraform apply

After apply, Grafana is available at localhost:3000, Prometheus at localhost:9090, and Jaeger at localhost:16686 — all pre-configured with agentloom dashboards and datasources.

ArgoCD

GitOps deployment with automated sync, self-heal, and retry policies. ArgoCD watches the Helm chart in the repository and syncs changes automatically. The Application CRD handles Kubernetes Job immutability via Replace=true and ignoreDifferences on selectors.

kubectl apply -f deploy/argocd/application.yaml

See deploy/INFRASTRUCTURE.md for the full deployment guide, security hardening details, Helm chart reference, and CI/CD pipeline documentation.

Why not autonomous agents?

Most LLM frameworks focus on autonomous agents: self-directed reasoning, multi-agent delegation, unbounded tool loops. This works for demos and open-ended research, but breaks down in production where you need predictable costs, debuggable failures, and SLA compliance.

AgentLoom is not an autonomous agent framework. There are no self-directed agents, no unbounded loops, no emergent behavior. It is a deterministic workflow orchestrator that uses LLMs as execution steps within a declared DAG.

The difference matters:

- You define the DAG, not the LLM. Steps, dependencies, and routing logic are declared upfront in YAML. The model generates text within a step; it does not decide what runs next. Routers use explicit boolean conditions, not LLM judgement.
- Observability is not optional. Every step emits OpenTelemetry traces and Prometheus metrics. You can see exactly what ran, how long it took, and how much it cost. Autonomous agents are notoriously hard to debug; a static DAG with full tracing is not.
- Cost is bounded. Budget limits, circuit breakers, and rate limiters are first-class. A runaway autonomous agent can burn through an API budget in minutes. A workflow with budget_usd: 0.50 cannot.
- Fallback is structural. If OpenAI is down, the gateway falls back to Anthropic or Ollama automatically. This is a routing decision at the infrastructure level, not an agent "choosing" a provider.
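
The cost and fallback points can be sketched together in a few lines. This is an illustration under assumptions, not AgentLoom's gateway: `Budget`, `BudgetExceeded`, and `call_with_fallback` are hypothetical names, each provider is modelled as a callable returning `(text, cost_usd)`, and a provider outage is modelled as `ConnectionError`.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the spending cap has already been reached."""


class Budget:
    """Track cumulative spend and refuse new calls once the cap is hit."""

    def __init__(self, limit_usd):
        self.limit_usd = limit_usd
        self.spent_usd = 0.0

    def ensure_available(self):
        if self.spent_usd >= self.limit_usd:
            raise BudgetExceeded(
                f"spent {self.spent_usd:.4f} of {self.limit_usd:.2f} USD"
            )

    def charge(self, cost_usd):
        self.spent_usd += cost_usd


def call_with_fallback(providers, prompt, budget):
    """Try providers in order; the first success is charged and returned."""
    budget.ensure_available()  # checked before any provider is contacted
    errors = []
    for name, call in providers:
        try:
            text, cost_usd = call(prompt)
        except ConnectionError as exc:  # provider down: move to the next one
            errors.append(f"{name}: {exc}")
            continue
        budget.charge(cost_usd)
        return name, text
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

In this sketch the cap is checked before each call and the cost is recorded afterwards, so spend can overshoot the limit by at most one call. A stricter guard would estimate the cost up front. The provider order is fixed by the caller, which is the sense in which fallback is an infrastructure decision and not a model decision.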

Autonomous agent frameworks solve a real problem — open-ended tasks where the execution path cannot be known in advance. But most LLM workloads in production are not open-ended. They are pipelines: classify, enrich, route, generate, validate. For those, you want predictability and control, not autonomy. That is what AgentLoom is for.

Development

uv sync --group dev --all-extras   # install with all extras
uv run pytest                       # 458 tests, ~5s
uv run ruff check src/ tests/      # lint (ruff replaces flake8+isort)
uv run ruff format src/ tests/     # autoformat
uv run mypy src/                   # strict type checking

Pre-commit hooks run ruff automatically on staged files — see CONTRIBUTING.md for the full workflow.

Contributing

See CONTRIBUTING.md for setup instructions, code style, and PR guidelines.

License

MIT

Release history

- 0.4.0 (Apr 15, 2026)
- 0.3.0 (Apr 12, 2026)
- 0.2.0 (Mar 30, 2026), this version
- 0.1.2 (Mar 26, 2026)
- 0.1.1 (Mar 22, 2026)
- 0.1.0 (Mar 22, 2026)

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

agentloom-0.2.0-py3-none-any.whl (66.7 kB)

Uploaded Mar 30, 2026 Python 3

File details

Details for the file agentloom-0.2.0-py3-none-any.whl.

File metadata

Download URL: agentloom-0.2.0-py3-none-any.whl
Upload date: Mar 30, 2026
Size: 66.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for agentloom-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`07d22379853bf573cd7fabc4459df66350f2d32e73a72bb4a319136f40591607`
MD5	`80478805c70c502d57461f8b04528d28`
BLAKE2b-256	`d0f87637b101a712c1ccd04cf0b08294461cae61fe40481f8c25c21efd4358f1`
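
A downloaded wheel can be checked against the SHA256 digest above with a few lines of standard-library Python. The helper names below are assumptions for illustration; the file is streamed in chunks so it does not need to fit in memory.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare against a published digest, ignoring case and whitespace."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For example, `matches("agentloom-0.2.0-py3-none-any.whl", "07d22379853bf573cd7fabc4459df66350f2d32e73a72bb4a319136f40591607")` should return True for an intact download. pip can enforce the same check at install time through `--require-hashes` with `--hash=sha256:...` entries in a requirements file.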

Provenance

The following attestation bundles were made for agentloom-0.2.0-py3-none-any.whl:

Publisher: release.yml on cchinchilla-dev/agentloom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: agentloom-0.2.0-py3-none-any.whl
- Subject digest: 07d22379853bf573cd7fabc4459df66350f2d32e73a72bb4a319136f40591607
- Sigstore transparency entry: 1198863151
- Sigstore integration time: Mar 30, 2026
Source repository:
- Permalink: cchinchilla-dev/agentloom@6edc54bfc9bfbb2ac31f5c31699b3f95dea4937b
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/cchinchilla-dev
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@6edc54bfc9bfbb2ac31f5c31699b3f95dea4937b
- Trigger Event: push
