Lightweight, hackable multi-agent orchestration lab (CLI + Python) with transcripts, checkpoints, budgets, and pluggable providers/tools.
Project description
Agentry Lab — Multi‑Agent Orchestration for Experiments
Serious tooling, delightfully unserious outcomes. 😏
New: Let humans heckle the agents. Schedule user turns and poke the room via CLI/API. Try:
agentrylab say user_in_the_loop.yaml demo "Hello!"thenagentrylab run user_in_the_loop.yaml --thread-id demo --resume --max-iters 1🎤
A lightweight, hackable lab for building and evaluating multi‑agent workflows. Define your lab room (agents, tools, providers, schedules) in YAML, then run and iterate quickly from the CLI or Python. Stream outputs, save transcripts, stash checkpoints — because sometimes you want agents to argue… on purpose.
🚀 Get Started in 2 Minutes
- Install:
pip install agentrylab(or see installation below) - Run:
agentrylab run solo_chat.yaml --max-iters 3 - Done! Watch your agents chat away! 🎉
💡 New to multi-agent systems? Start with Solo Chat - it's perfect for beginners and works great with local models like Ollama/llama3!
✨ Why AgentryLab?
Because single agents are boring. 🤖
- 📦 YAML‑first presets for agents/advisors/moderator/summarizer (your config, your rules)
- 🔌 Pluggable LLM providers (OpenAI, Ollama) and tools (ddgs, Wolfram Alpha)
- 📡 Streaming CLI with resume support and transcript/DB persistence (forget nothing, replay everything)
- ⏳ Smart budgets for tools (per‑run/per‑iteration) with shared‑per‑tick semantics (no more runaway tool spam)
- 🧩 Small, readable runtime: nodes, scheduler, engine, state (batteries included, drama optional)
- 🫵 Human‑in‑the‑loop turns: schedule
usernodes and poke runs from CLI/API (agentrylab say …)
📋 Requirements
- 🐍 Python 3.11+
- 🧰 Virtual environment (recommended; sanity‑preserving)
- 🖥️ Optional: Ollama for local models (default:
http://localhost:11434) - 🔑 API keys as needed (e.g.,
OPENAI_API_KEY,WOLFRAM_APP_ID) — bring your own secrets
💾 Installation
Option 1: From PyPI (Recommended)
pip install agentrylab
Option 2: From Source (Development)
git clone https://github.com/Alexeyisme/agentrylab.git
cd agentrylab
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -U pip
pip install -e .
🔧 Environment Setup
Create a .env file (loaded via python-dotenv) with any secrets you need:
# For OpenAI models (optional)
OPENAI_API_KEY=sk-...
# For Wolfram Alpha (optional)
WOLFRAM_APP_ID=...
# For Ollama (optional, defaults to localhost:11434)
OLLAMA_BASE_URL=http://localhost:11434
💡 Pro tip: You can start with just Ollama (free, local) and add API keys later!
🚀 Quick Start
CLI Quickstart
Spin up a room and let the sparks fly:
# Simple chat (works with Ollama/llama3)
agentrylab run solo_chat.yaml --max-iters 3
# Or with a custom topic
JOKE_TOPIC="remote work" agentrylab run standup_club.yaml --max-iters 4
# Or a debate (needs OpenAI API key)
agentrylab run debates.yaml --max-iters 4 --thread-id demo
User Messages (User-in-the-Loop)
Let a human chime in via API or CLI, and optionally schedule a user turn in cadence.
# 1) Post a user message into a thread
agentrylab say user_in_the_loop.yaml demo "Hello from Alice!"
# 2) Run one iteration to consume it (user turn then assistant)
agentrylab run user_in_the_loop.yaml --thread-id demo --resume --max-iters 1
Python API:
from agentrylab import init
lab = init("src/agentrylab/presets/user_in_the_loop.yaml", experiment_id="demo")
lab.post_user_message("Hello from Alice!", user_id="user:alice")
lab.run(rounds=1)
Python API Quickstart
Orchestrate from Python with minimal fuss:
from agentrylab import init, list_threads
# 1. Create lab (using solo_chat preset - perfect for llama3!)
lab = init("src/agentrylab/presets/solo_chat.yaml",
experiment_id="my-chat",
prompt="Tell me about your favorite hobby!")
# 2. Run with callback
def callback(event):
if event.get("event") == "provider_result":
print(f"Agent responded: {event.get('content_len', 0)} chars")
status = lab.run(rounds=3, stream=True, on_event=callback)
# 3. Show conversation
for msg in lab.state.history:
print(f"[{msg['role']}]: {msg['content']}")
# 4. Resume with new topic
lab.state.objective = "Now tell me about your dream vacation!"
lab.run(rounds=2)
# 5. List threads
threads = list_threads("src/agentrylab/presets/solo_chat.yaml")
Python examples:
user_in_the_loop_quick.py— post once and run N roundsuser_in_the_loop_interactive.py— type a line, run a round, repeat
📝 Note: Output streams each iteration ("=== New events ===") and prints a final tail of the last N transcript entries. Transcripts are written to
outputs/*.jsonland checkpoints tooutputs/checkpoints.db.
🖥️ CLI Commands
Basic Commands
# Run a preset
agentrylab run <preset.yaml> [--thread-id ID] [--max-iters N] [--show-last K]
# Inspect a thread's checkpoint
agentrylab status <preset.yaml> <thread-id>
# List all known threads
agentrylab ls <preset.yaml>
Common Options
--max-iters N: Run for N iterations (default: varies by preset)--thread-id ID: Use specific thread ID (enables resume)--show-last K: Show last K messages at the end--stream/--no-stream: Enable/disable real-time streaming (default: enabled)--resume/--no-resume: Resume from checkpoint or start fresh (default: resume)
📚 Full docs: See
src/agentrylab/docs/CLI.mdfor complete command reference.
User-in-the-loop:
agentrylab say <preset.yaml> <thread-id> "message" [--user-id USER]appends a user message into a thread.- Works with scheduled user nodes (role
user) so messages are consumed on their turns.
⚙️ Configuration
Describe your room in YAML; everything else clicks into place.
- Presets: shipped with the package; the CLI accepts packaged names like
solo_chat.yaml(file paths work too) - Providers: OpenAI (HTTP), Ollama; add your own under
runtime/providers - Tools:
ddgssearch, Wolfram Alpha; add your own underruntime/tools - Scheduler: Round‑robin and Every‑N; build your own in
runtime/scheduler
🎭 Built-in Presets
Have fun out of the box — llama3‑friendly and non‑strict by default.
🎤 Solo Chat (solo_chat.yaml) - Perfect for beginners!
- What: Single friendly agent ready to chat about anything
- Best for: Testing, simple conversations, llama3 users
- Run:
agentrylab run solo_chat.yaml --max-iters 3 - Topic:
CHAT_TOPIC="your topic"
🎭 Stand‑Up Club (standup_club.yaml) - Comedy gold!
- What: Two comedians riff on a topic, punch‑up advisor adds tweaks, MC closes the set
- Best for: Entertainment, creative writing, humor
- Run:
JOKE_TOPIC="airports" agentrylab run standup_club.yaml --max-iters 6 - Topic:
JOKE_TOPIC="your topic"
🧠 Drifty Thoughts (drifty_thoughts.yaml) - Free-form thinking
- What: Three "thinkers" drift playfully; gentle advisor nudges; optional summarizer
- Best for: Creative brainstorming, philosophical discussions
- Run:
TOPIC="surprising ideas" agentrylab run drifty_thoughts.yaml - Topic:
TOPIC="your topic"
🔬 Research Collaboration (research.yaml) - Academic vibes
- What: Two scientists brainstorm, style coach gives clarity, moderator emits JSON actions
- Best for: Research, academic discussions, structured thinking
- Run:
TOPIC="curious scientific question" agentrylab run research.yaml - Topic:
TOPIC="your topic"
🛋️ Therapy Session (therapy_session.yaml) - Compassionate chat
- What: Reflective client and gentle therapist; summarizer offers compassionate wrap‑up
- Best for: Emotional discussions, self-reflection, supportive conversations
- Run:
TOPIC="something on your mind" agentrylab run therapy_session.yaml - Topic:
TOPIC="your topic"
🔍 DDG Quick Summary (ddg_quick_summary.yaml) - Web research
- What: One agent searches DuckDuckGo and writes a 5‑bullet web summary with URLs
- Best for: Quick research, web summaries, fact-finding
- Run:
SUMMARY_TOPIC="your topic" agentrylab run ddg_quick_summary.yaml - Topic:
SUMMARY_TOPIC="your topic"
☕ Small Talk (small_talk.yaml) - Casual chat
- What: Two friendly voices chat; host recaps every few turns
- Best for: Casual conversations, social interactions
- Run:
SMALL_TALK_TOPIC="coffee rituals" agentrylab run small_talk.yaml - Topic:
SMALL_TALK_TOPIC="your topic"
💡 Brainstorm Buddies (brainstorm_buddies.yaml) - Idea generation
- What: Two idea buddies riff; scribe pulls a shortlist
- Best for: Brainstorming, creative ideation, problem-solving
- Run:
BRAINSTORM_TOPIC="rainy day activities" agentrylab run brainstorm_buddies.yaml - Topic:
BRAINSTORM_TOPIC="your topic"
❓ Follow‑Up Q&A (follow_up.yaml) - Structured interviews
- What: Explainer → interviewer → explainer → interviewer → summarizer
- Best for: Educational content, interviews, structured Q&A
- Run:
FOLLOWUP_TOPIC="solar panels at home" agentrylab run follow_up.yaml - Topic:
FOLLOWUP_TOPIC="your topic"
🏛️ Debates (debates.yaml) - Formal arguments
- What: Pro/con debaters with moderator and evidence-based arguments
- Best for: Formal debates, argument analysis, structured discussions
- Run:
agentrylab run debates.yaml --max-iters 4 - Note: Requires OpenAI API key for best results
🗣️ Simple Argument (argue.yaml) - Casual debates
- What: Two agents having a natural debate without strict rules
- Best for: Casual arguments, opinion discussions, llama3 users
- Run:
DEBATE_TOPIC="Should remote work become standard?" agentrylab run argue.yaml - Topic:
DEBATE_TOPIC="your topic"
💡 Pro tip: Start with Solo Chat for testing, then try Stand‑Up Club for fun!
📚 More tips: Seesrc/agentrylab/docs/PRESET_TIPS.mdfor advanced configuration.
👤 User in the Loop (user_in_the_loop.yaml) — Human turn in the cadence
- What: A scheduled
usernode consumes queued user messages before the assistant - Best for: Interactive runs where a human can steer between turns
- Try:
agentrylab say user_in_the_loop.yaml demo "Hi agents!"agentrylab run user_in_the_loop.yaml --thread-id demo --resume --max-iters 1
🗣️ Solo Chat (User Turn) (solo_chat_user.yaml) — Classic chat with a scheduled user
- What: A scheduled
usernode (user:you) before a single assistant - Best for: Simple human-steered chats using local models (llama3)
- Try:
agentrylab say solo_chat_user.yaml demo "Hello!" --user-id user:youagentrylab run solo_chat_user.yaml --thread-id demo --resume --max-iters 1
💰 Tool Budgets
Control how many times tools can be called to prevent runaway costs:
per_run_max: Total calls per tool across the entire runper_iteration_max: Calls per engine tick (resets each tick)- Scope: Enforced per tool ID, shared across agents in the same tick
- Minima (
per_run_min,per_iteration_min) are advisory (not enforced)
📜💾 Persistence
Transcripts for storytelling; checkpoints for recovery.
- 📜 Transcript JSONL:
outputs/<thread-id>.jsonl(human-readable conversation logs) - 💾 Checkpoints (SQLite):
outputs/checkpoints.db(resume from any point) - ⏭️ Resume:
--resume(default) continues from last checkpoint;--no-resumestarts fresh - 🧠 Schemas: See
src/agentrylab/docs/PERSISTENCE.mdfor detailed field definitions - ⏱️ Timestamps: All recorded as Unix epoch seconds (UTC)
Cleaning outputs (all threads)
- Remove everything (default paths):
rm -rf outputs/ - Or per-thread:
agentrylab ls <preset.yaml>thenagentrylab reset <preset.yaml> <thread-id> --delete-transcript
🏗️ Architecture (at a glance)
Simple, readable runtime components:
- Engine: Steps the scheduler, executes nodes, applies outputs/actions
- Nodes: Agent, Moderator, Summarizer, Advisor (see
runtime/nodes/*) - Providers: Thin HTTP adapters (OpenAI, Ollama)
- Tools: Simple callables with normalized envelopes (e.g., ddgs)
- State: History window composition, budgets, message contracts, rollback
🧑💻 Development
Serious tooling for serious… tinkering.
# Install development dependencies
pip install -e .[dev]
# Lint and test
ruff check . && pytest -q
# Coverage (uses pytest-cov; default fail-under=40%)
make coverage
# or: pytest --cov=src/agentrylab --cov-branch --cov-report=term-missing
☕️ Pro tip: Keep a coffee nearby. Agents love to riff.
🐍 Python API
Basic Usage
from agentrylab import init
# Initialize a lab and run for N rounds
lab = init("src/agentrylab/presets/solo_chat.yaml",
experiment_id="my-experiment",
prompt="Tell me about your favorite hobby!")
status = lab.run(rounds=5)
print(f"Iterations: {status.iter}, Active: {status.is_active}")
# View conversation history
for msg in lab.state.history:
print(f"[{msg['role']}]: {msg['content']}")
Posting User Messages
from agentrylab import init
lab = init("src/agentrylab/presets/user_in_the_loop.yaml", experiment_id="chat-1")
# Append a user line into history and transcript; also enqueue for scheduled user nodes
lab.post_user_message("Please keep it concise.", user_id="user:alice")
lab.run(rounds=1)
One-shot Run with Streaming
from agentrylab import run
def on_event(ev: dict):
print(f"Iteration {ev['iter']}: {ev['agent_id']} ({ev['role']})")
lab, status = run(
"src/agentrylab/presets/solo_chat.yaml",
prompt="What makes jokes funny?",
experiment_id="streaming-demo",
rounds=5,
stream=True,
on_event=on_event,
)
Budget Management
from agentrylab import init
# Set budgets in preset, then inspect counters
preset = {
"id": "budget-demo",
"providers": [{"id": "p1", "impl": "tests.fake_impls.TestProvider", "model": "test"}],
"tools": [{"id": "echo", "impl": "tests.fake_impls.EchoTool"}],
"agents": [{"id": "pro", "role": "agent", "provider": "p1", "system_prompt": "You are the agent.", "tools": ["echo"]}],
"runtime": {
"scheduler": {"impl": "agentrylab.runtime.scheduler.round_robin.RoundRobinScheduler", "params": {"order": ["pro"]}},
"budgets": {"tools": {"per_run_max": 1}},
},
}
lab = init(preset, experiment_id="budget-demo-1", resume=False)
lab.run(rounds=1)
snap = lab.store.load_checkpoint("budget-demo-1")
print("Total tool calls:", snap.get("_tool_calls_run_total"))
Logging & Tracing
# Configure runtime logging/trace in the preset
preset = {
# ... providers/tools/agents ...
"runtime": {
"logs": {"level": "INFO", "format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
"trace": {"enabled": True},
"scheduler": {"impl": "agentrylab.runtime.scheduler.round_robin.RoundRobinScheduler", "params": {"order": ["pro"]}},
},
}
lab = init(preset, experiment_id="log-1")
lab.run(rounds=1)
📚 API Reference
Core Functions
init(config, *, experiment_id=None, prompt=None, user_messages=None, resume=True) -> Lab
config: YAML path, dict, or validated Preset objectexperiment_id: Logical run/thread ID; enables resumeprompt: Setscfg.objectivefor the run (used in prompts when enabled)user_messages: String or list of strings; seeds initial user message(s) into contextresume: Attempts to load checkpoint forexperiment_id
run(config, *, prompt=None, experiment_id=None, rounds=None, resume=True, stream=False, on_event=None, timeout_s=None, stop_when=None, on_tick=None, on_round=None) -> (Lab, LabStatus)
- One-shot helper; see
Lab.runfor parameters
Lab Methods
Lab.run(*, rounds=None, stream=False, on_event=None, timeout_s=None, stop_when=None, on_tick=None, on_round=None) -> LabStatus
rounds: Number of iterations to runstream: When True, callson_event(event: Event)for newly appended transcript entriestimeout_s: Optional wall-clock timeout for streaming runsstop_when: Optional predicateEvent -> bool; when returns True, run stops
Lab.stream(*, rounds=None, timeout_s=None, stop_when=None, on_tick=None, on_round=None) -> Iterator[Event]
- Generator that yields transcript events as they occur
- Optional callbacks:
on_tick(info),on_round(info)whereinfo = {"iter": int, "elapsed_s": float}
Other Lab Methods:
Lab.status(property) ->LabStatusLab.history(limit=50)->list[Event]Lab.clean(thread_id=None, delete_transcript=True, delete_checkpoint=True) -> None: Delete outputs for a threadlist_threads(config) -> list[tuple[str, float]]: List (thread_id, updated_at) in persistence
📦 Releasing
We publish on tags via GitHub Actions (see .github/workflows/release.yml).
For maintainers:
- Bump
versioninpyproject.toml - Update
CHANGELOG.md git tag -a vX.Y.Z -m 'vX.Y.Z' && git push --tags- CI builds sdist/wheel and uploads to PyPI using
PYPI_API_TOKENsecret
📋 Event Schema
from agentrylab import Event
def handle(ev: Event) -> None:
print(ev["iter"], ev["agent_id"], ev["role"], ev.get("latency_ms"))
# Keys: t, iter, agent_id, role, content (str|dict), metadata (dict|None), actions (dict|None), latency_ms
💾 Checkpoint Snapshot Fields
Returned by lab.store.load_checkpoint(thread_id) as a dict of state attributes:
thread_id: Current experiment IDiter: Iteration counterstop_flag: Stop signal for the enginehistory: In‑memory context entries{agent_id, role, content}used by prompt compositionrunning_summary: Summarizer running summary if set_tool_calls_run_total,_tool_calls_iteration: Global tool counters_tool_calls_run_by_id,_tool_calls_iter_by_id: Per‑tool counterscfg,contracts: Complex/opaque objects (implementation detail)
Note: If a legacy/opaque pickle was saved, you'll get
{ "_pickled": ... }instead
🍳 Recipes
Programmatic Preset Construction
from agentrylab import init
preset = {
"id": "programmatic",
"providers": [{"id": "p1", "impl": "agentrylab.runtime.providers.openai.OpenAIProvider", "model": "gpt-4o"}],
"tools": [],
"agents": [{"id": "pro", "role": "agent", "provider": "p1", "system_prompt": "You are the agent."}],
"runtime": {
"scheduler": {"impl": "agentrylab.runtime.scheduler.round_robin.RoundRobinScheduler", "params": {"order": ["pro"]}}
},
}
lab = init(preset, experiment_id="prog-1", user_messages=["Start topic: ..."])
lab.run(rounds=3)
Multiple Runs in a Loop
topics = ["jokes", "puns", "metaphors"]
for i, topic in enumerate(topics):
lab = init("src/agentrylab/presets/debates.yaml", experiment_id=f"exp-{i}", prompt=f"Explore {topic}")
lab.run(rounds=2)
Inspecting Transcripts
lab = init("src/agentrylab/presets/debates.yaml", experiment_id="inspect-1")
lab.run(rounds=1)
for ev in lab.history(limit=20):
print(ev["iter"], ev["agent_id"], ev["role"], str(ev["content"])[:80])
# Or read directly from the store
rows = lab.store.read_transcript("inspect-1", limit=100)
Cleaning Outputs (Transcript + Checkpoint)
from agentrylab import init
lab = init("src/agentrylab/presets/debates.yaml", experiment_id="demo-clean")
lab.run(rounds=1)
# Remove persisted outputs for this experiment
lab.clean() # or lab.clean(thread_id="some-other-id")
📄 License
MIT
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file agentrylab-0.1.1.tar.gz.
File metadata
- Download URL: agentrylab-0.1.1.tar.gz
- Upload date:
- Size: 84.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
30233ce17af683c6365d2e6a3feddaee3c89adc1362a4d9d0fcd30f238b440f8
|
|
| MD5 |
0219b58b9175a378a43c8c9237626903
|
|
| BLAKE2b-256 |
47ce361404def883709d656f55b010742f69fca88f4e4bdf7489cd5a0b855acd
|
File details
Details for the file agentrylab-0.1.1-py3-none-any.whl.
File metadata
- Download URL: agentrylab-0.1.1-py3-none-any.whl
- Upload date:
- Size: 97.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
667daae895d577b51cb81882f07936abe1a85ace74422f9811a4a305fc693b63
|
|
| MD5 |
dad711e092d76f0adeaab54cf232f66e
|
|
| BLAKE2b-256 |
e9bda20392d5d2aabefadbd64cbd0d256a45f8852ff97f455e2829ed5b4bac8c
|