Skip to main content

Memory-first multi-agent orchestrator built on AgentAZAll

Project description

AZClaw — Memory-First Multi-Agent Orchestrator

Three agents. Shared persistent memory. Runs for hours. Context never grows.

AZClaw is a multi-agent orchestrator built on AgentAZAll. It runs multiple LLM agents in rounds, using persistent memory instead of conversation history. The context window stays small. Speed stays constant. Agents recall what they need, when they need it.

Validated in production: 3 agents, 199 rounds, 8 hours 46 minutes, 402 memories, 52 Python files, zero errors, $0 in API costs.

Install

pip install azclaw

This automatically installs agentazall as a dependency.

20-Line Quickstart

from azclaw import Agent, Orchestrator

endpoint = "http://localhost:8080/v1/chat/completions"

architect = Agent("architect",
    role="You design software architecture. Be specific about file structure and data models.",
    endpoint=endpoint)

developer = Agent("developer",
    role="You write Python code. Follow the Architect's design exactly.",
    endpoint=endpoint, can_write=True)

reviewer = Agent("reviewer",
    role="You review code for bugs, security issues, and design violations.",
    endpoint=endpoint)

orch = Orchestrator(agents=[architect, developer, reviewer])
orch.set_task("Build a FastAPI REST API for a todo app with SQLite backend.")
orch.run(max_rounds=20)

print(f"Files: {orch.stats.files_written}")
print(f"Memories: {orch.stats.memories_stored}")

How It Works

Memory-First Architecture

Every other multi-agent framework stuffs the full conversation history into each API call. By round 30, context overflows. Speed collapses. Agents forget earlier decisions.

AZClaw takes the opposite approach:

  • Only the last round goes into context
  • Everything else is stored via remember() and retrieved via recall()
  • Context stays at 2–9K tokens regardless of how many rounds have passed
  • Speed at round 199 is the same as round 1

Role-Based Tool Access

Role recall remember read_file list_files write_file run_python
Architect yes yes yes yes no no
Developer yes yes yes yes yes yes
Reviewer yes yes yes yes no no

Only the Developer writes code. The Architect designs. The Reviewer validates. This prevents agents from overwriting each other's work.

Dedup Detection

If an agent calls the same tool with the same arguments twice in one turn, the duplicate is skipped automatically. If ALL calls in a turn are duplicates, the agent is forced to analyze existing results instead of looping.

Graceful Stop

touch STOP          # creates STOP file — orchestrator finishes current round and exits
azclaw stop         # same thing via CLI

Or send SIGINT/SIGTERM — the orchestrator catches the signal and exits cleanly after the current round.

CLI

# Run with a topic file
azclaw run --topic topics/my-migration.json

# Run with a simple task
azclaw run --task "Build a REST API for user management" --endpoint http://localhost:8080/v1/chat/completions

# Run with custom agents
azclaw run --task "Review this codebase" --agents "designer:http://localhost:8200/v1/chat/completions,coder:http://localhost:8201/v1/chat/completions"

# Resume from checkpoint
azclaw run --topic my-topic.json --resume logs/my_topic_checkpoint.json

# Stop gracefully
azclaw stop

Topic Configuration

For complex multi-phase tasks, define a topic JSON:

{
  "title": "COBOL to Python Migration",
  "system_context": "You are migrating a COBOL banking application to Python/FastAPI.",
  "initial_prompt": "Begin by analyzing the COBOL source files...",
  "phases": [
    {
      "name": "Discovery",
      "rounds": [1, 20],
      "focus": "Analyze all programs, map dependencies",
      "source_files": ["cbl/COSGN00C.cbl", "cbl/COACTUPC.cbl"]
    },
    {
      "name": "Data Layer",
      "rounds": [21, 45],
      "focus": "Convert copybooks to SQLAlchemy models"
    }
  ],
  "coherence_probes": [
    {"after_round": 25, "question": "What database engine did we choose and why?"}
  ],
  "agent_roles": {
    "architect": "You design the Python architecture. Map COBOL constructs to modern equivalents.",
    "developer": "You write Python code that preserves COBOL business logic exactly.",
    "reviewer": "You verify generated code against the original COBOL source."
  }
}

Custom Tools

Register custom tools with a decorator:

from azclaw import build_default_registry

registry = build_default_registry()

@registry.tool("search_code", "Search codebase with ripgrep",
               {"query": "string", "file_type": "string"})
def search_code(query: str, file_type: str = "py", _ctx=None):
    import subprocess
    r = subprocess.run(["rg", query, "--type", file_type, "."],
                       capture_output=True, text=True, timeout=10)
    return r.stdout[:5000] or "[No matches]"

# Pass to orchestrator
orch = Orchestrator(agents=[...], registry=registry)

Checkpointing & Resume

AZClaw saves checkpoints every 5 rounds to logs/. If a run is interrupted, resume from where it stopped:

azclaw run --topic my-topic.json --resume logs/my_topic_checkpoint.json

The checkpoint preserves: round number, all agent stats, memory counts, and the last 100 history messages.

The 9-Hour Proof

Three NVIDIA Nemotron-3-Nano-30B-A3B agents (3B active params each, MoE) ran for 8 hours 46 minutes on an AMD EPYC server with 8 GPUs, migrating AWS CardDemo (50,000 lines of COBOL) to Python:

Metric Value
Rounds 199
Runtime 8h 46m
Python files 52 (2,543 lines)
Memories 402
Tool calls 1,599
Tokens 24.3M total
Context per agent 2–9K (flat)
Speed 97–137 tok/s (start to finish)
Errors 0
Cloud API cost $0.00

Download the full results: carddemo-agentazall-results.zip (743KB)

Architecture

~960 lines of Python. 1 dependency (agentazall). That's it.

azclaw/
├── __init__.py       (20 lines)   Public API
├── agent.py          (100 lines)  Agent class — LLM + role + tools
├── orchestrator.py   (350 lines)  Round loop, memory-first context, checkpoints
├── tools.py          (260 lines)  ToolRegistry + 6 built-in tools
├── llm.py            (100 lines)  urllib-only OpenAI client
├── topic.py          (70 lines)   Phase/topic config loader
└── cli.py            (100 lines)  CLI entry point

Requirements

  • Python 3.10+
  • Any OpenAI-compatible LLM endpoint (llama.cpp, vLLM, Ollama, LM Studio, OpenRouter)
  • At least one GPU with 8GB+ VRAM (for a small model)

Links

License

AGPL-3.0 — same as AgentAZAll.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

azclaw-0.1.0.tar.gz (29.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

azclaw-0.1.0-py3-none-any.whl (27.5 kB view details)

Uploaded Python 3

File details

Details for the file azclaw-0.1.0.tar.gz.

File metadata

  • Download URL: azclaw-0.1.0.tar.gz
  • Upload date:
  • Size: 29.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for azclaw-0.1.0.tar.gz
Algorithm Hash digest
SHA256 9423236ad8d99604a41835be837860d62c549eb0a50e79d9aecff947f7e3f995
MD5 4c49a069d800152ef10f38dc3c73b8b4
BLAKE2b-256 f77c0e50fe36eaa807f7ac005a08857f8bd71fa0b5e98b4f71404edc8dbd41ee

See more details on using hashes here.

File details

Details for the file azclaw-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: azclaw-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 27.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for azclaw-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 107bd8892475470311725e78861426a3385072f0c7fc9fd07705ef98053abbf8
MD5 3e7fd5e0a653b2cd36d775a27f420f61
BLAKE2b-256 78ce6a6bdb1d418501769f7f056053299fae37c48c02edb71e4a06b896c98116

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page