Skip to main content

Embeddable agent execution kernel — LLM loop, hooks, events, tools, dynamic sub-agents.

Project description

power-loop

PyPI Python CI License

English · 中文 · Docs · Examples · Changelog

Loop engineering, not framework adoption. power-loop is an embeddable agent execution kernel: you engineer the agent loop — hooks at every lifecycle point, pluggable storage, sandbox seams, compaction, deterministic workflows — instead of building your app inside a framework. The loop itself is a lightweight, stateless handle over a pluggable store (SQLite by default — zero infrastructure — or PostgreSQL/MySQL by DSN). Out of it you get durable multi-turn sessions, tool calling, sub-agents, crash-resumable multi-agent workflows, durable timers, and process-level sandboxing. No service to run, no graph DSL to learn.

from power_loop import StatefulAgentLoop, create_llm_service_from_env

# The loop is a thin, stateless handle over a store. Default = one SQLite file (zero infra);
# swap dsn= to "postgresql://…/app" or "mysql://…/app" and nothing else changes.
loop = StatefulAgentLoop(llm=create_llm_service_from_env(), dsn="app.db")
sid = await loop.new_session()
await loop.send("Remember my favorite color is teal.", session_id=sid)
print((await loop.send("What's my favorite color?", session_id=sid)).final_text)
# → "Your favorite color is teal."   (durable; survives restarts)

The conversation is already durable, resumable, and tool-capable. And because the loop holds no authoritative state, a fresh process resumes it from nothing but a DSN + the session id:

# Cold start, another process — reconstruct the loop and continue. No state to serialize/carry.
loop = StatefulAgentLoop(llm=create_llm_service_from_env(), dsn="app.db")
print((await loop.send("And my second-favorite?", session_id=sid)).final_text)
pip install 'power-loop[openai]'      # or [anthropic] · add [postgres] / [mysql] for those backends

Stable since 1.0; now 3.x. The public API is frozen under SemVer and machine-enforced by a baseline guard in CI — and the two major bumps since prove the discipline rather than undercut it: 2.0 moved storage to a pluggable async backend, 3.0 made context handling two orthogonal axes. Both were real breaking changes, so both got a major bump. The core has zero runtime dependencies (pure stdlib; verified by a CI job that imports it with nothing else installed) — LLM transports and database drivers are optional extras. Backed by 900+ unit tests, a live-LLM suite, and a 3-backend conformance suite (SQLite/PostgreSQL/MySQL). See Stability and the honest caveats — a young, single-maintainer project says so plainly.


Start here

You are… Go to
🚀 New — show me the 5-minute version Getting Started
🛠️ Learning by building Tutorials — chatbot · tools · human-in-the-loop · multi-agent
🧩 Browsing runnable code 43 examples00_hello_world.py → full chatbot
📚 Looking something up User Guide · API reference
🤔 Deciding if it fits How it compares · Honest scope

Find your way by goal: persist & resume across restarts → Sessions · pick a backend (SQLite/PG/MySQL) → Storage backends · give it tools → Tools / Extending · multi-agent → Workflows · sandbox untrusted code → Sandboxing · monitor → Observability · scale → Scaling · survive crashes → Pending recovery.


Why power-loop — "loop engineering"

Most "agent frameworks" ask you to build your app inside them. power-loop is the opposite: a library you embed. You keep your HTTP layer, your auth, your queues, your RAG, your UI, your deploy. It runs the agent loop — and lets you engineer it.

  • 🪶 Featherweight & zero-dependency. No pydantic, no LangChain, no graph DSL. A compact, pure-stdlib core (~24k lines) whose public surface is essentially one class — and zero runtime dependencies. LLM transports and the Postgres/MySQL drivers are pulled in only by the extra you install.
  • 🗄️ Pluggable storage, zero-infra default. Sessions, timers, sub-agent trees, workflow journals, the shared blackboard — one backend-neutral store written once against a tiny Database/Dialect port. The default is one SQLite file (copy the file, you've copied the state); point a DSN at PostgreSQL or MySQL when you want a real multi-writer server — same code, same conformance suite. Tables are auto-created, or provisioned out-of-band with a printed DDL script (see Storage backends).
  • ♻️ Stateless, resumable loops. A StatefulAgentLoop carries no authoritative state — all of it lives in the store. So a loop is cheap to create and trivially restored from a DSN + a session id (ideal for web handlers, workers, cold starts). It self-caches each session's active window (a rebuildable accelerator that never changes what the model sees) to skip re-reads on hot paths.
  • ⏱️ Durable by default. Crash mid-run and resume(). Agents schedule their own durable timers that survive restarts. Workflows replay finished steps and re-run only the unfinished tail after a process death. The store survives version upgrades (a portable, backend-neutral migration-version table) and can be pruned, VACUUMed, and exported.
  • 🧠 Context engineering, not one fixed strategy. How each finished send is recorded/rendered (representation: full verbatim or a terse per-send projection) and how older history is compacted once over budget (fold strategy: a single LLM summary, or an agentic pass that also writes durable notes) are two orthogonal, config-driven axes — any representation composes with any fold strategy, and both take your own Representation / FoldStrategy implementation. Folds always keep whole sends (never split a tool-call/result pair); recall_send / recall_compacted pull the original detail back from the immutable audit log.
  • 🧩 Composable from one loop to a fleet. Start with send(). Add tools. Spawn sub-agents. Fan out a deterministic workflow (sequence/parallel/foreach/branch). Run each leaf in its own process and DB behind a sandbox. Same primitives all the way up.
  • 🛡️ Isolation seams where it counts. Tool-level sandboxing via a ShellBackend (drop in gVisor/Docker for bash); process-level via a WorkerLauncher (wrap a whole sub-agent worker per leaf). power-loop stays sandbox-agnostic; you choose the policy.
  • 🔬 Built to be observed. Typed events for every stream chunk, tool call, round, and individual LLM call — each seq-ordered + monotonic-clock stamped. Pluggable sinks behind extras: durable JSONL (with replay), Prometheus/StatsD metrics, an OpenTelemetry span tree. Per-run + per-session token accounting and hard per-run budgets.
  • 🔌 Open ecosystem. Provider-agnostic (any OpenAI-compatible endpoint or native Anthropic, by env var). Bring any tool via the ToolRegistry, or connect a Model Context Protocol server with one adapter.
  • Real-tested. A dedicated tests/real/ suite runs the library — workflows, resume, sandboxed subprocess agents, structured output, compaction, a live MCP server — against a real model; the storage layer has a backend-agnostic conformance suite run against SQLite, PostgreSQL, and MySQL.

What you get

Capability One-liner Docs
Stateful sessions Durable multi-turn memory + resume by id, in SQLite/PG/MySQL Sessions
Pluggable backends One store, dsn= picks SQLite (default) / PostgreSQL / MySQL; configurable schema provisioning Storage backends
Stateless / resumable loop Loop holds no state; reconstruct from dsn + session_id; cheap to create Sessions
Tool calling JSON-Schema-validated tools; built-in bash/file/search/skills presets Tools · Extending
Sub-agents Delegate to a child loop via AgentSpec (own prompt/tools/model) Sub-agents
Dynamic workflows JSON DSL (sequence/parallel/foreach/branch) the LLM can author; deterministic engine Workflows
Workflow resume Journals each step; after a crash, replays completed steps and re-runs only the tail Workflows
Process sandboxing Each workflow leaf in its own OS process + own DB; wrap each in gVisor/Docker per leaf Sandboxing
Durable timers Agents schedule their own wake-ups; survive restarts; one-shot or recurring Timers
Context — representation Record/render each finished send verbatim or as a terse per-send projection (derived pl_project_messages); pl_messages stays immutable; recall_send re-expands Projection
Context — fold strategy Compact older history once over budget: LLM summary or agentic (also writes notes); pluggable FoldStrategy; never splits a tool pair; recall_compacted re-expands Compaction
Durability ops Portable migration-version table, retention/prune, VACUUM, export_session/import_session, graceful aclose() Sessions
Observability Typed seq-ordered events → durable JSONL + replay, Prometheus/StatsD metrics, OpenTelemetry spans Observability
MCP tools Surface a Model Context Protocol server's tools as power-loop tools Extending
Hooks & events Veto/observe at every lifecycle point; strongly-typed event payloads Hooks · Events
Structured output output_schema → provider response_format → parsed & validated Structured
Pluggable memory Cross-session recall via a MemoryProvider Protocol Memory
Retry / cancel / budgets Provider-aware retry, a unified cancellation token, hard per-run token caps Retry & Cancel
Stable error codes Every PowerLoopError carries a frozen machine-readable code — branch on exc.code API: error codes
Crash recovery heal_pending / resume / abort_pending for runs killed mid tool-call Pending recovery

Highlights

Pluggable storage — SQLite by default, PostgreSQL/MySQL by DSN

The whole store (sessions, messages, timers, compaction journals, sub-agent trees, the blackboard) is written once against a tiny async Database + Dialect port. Pick the backend with a DSN; the code above it never changes.

from power_loop import StatefulAgentLoop, SchemaPolicy

StatefulAgentLoop(llm=llm, dsn="app.db")                                  # SQLite (zero infra, default)
StatefulAgentLoop(llm=llm, dsn="postgresql://u:p@host/app")               # PostgreSQL  → pip install 'power-loop[postgres]'
StatefulAgentLoop(llm=llm, dsn="mysql://u:p@host/app", table_prefix="pl_")  # MySQL    → pip install 'power-loop[mysql]'

# Schema provisioning is a policy. AUTO_CREATE (default) creates tables if missing; VERIFY only
# checks and, if the schema is absent, raises with the EXACT DDL to run as a privileged user.
StatefulAgentLoop(llm=llm, dsn="postgresql://readonly@host/app", schema=SchemaPolicy.VERIFY)

SQLite is a single-writer file (zero infra, shard across processes). PostgreSQL/MySQL are real multi-writer servers — per-session sequence allocation is correct across processes via a SELECT … FOR UPDATE row lock. The same backend-agnostic conformance suite runs against all three. See Storage backends for the per-backend DDL and provisioning options.

Stateless, resumable loops

A StatefulAgentLoop is a handle, not a session. It owns no conversation state — that all lives in the store — so it's cheap to create and you resume any session by id from a cold process:

# Web handler / worker: build a loop per request, resume the user's session, done.
loop = StatefulAgentLoop(llm=create_llm_service_from_env(), dsn=DSN)
await loop.prewarm(session_id)                       # optional: pre-load the active window
result = await loop.send(user_text, session_id=session_id)

Under the hood the loop keeps a per-session active-window cache — but it caches only the durable projection, validated by a monotonic next_seq token, so it's a pure accelerator: a cold loop with an empty cache produces byte-for-byte the same prompts (proven by a warm-vs-cold conformance test, including the recall/compaction/prompt-edit edge cases).

Context engineering — two orthogonal axes you choose (and can implement yourself)

Long conversations outgrow the window. Most libraries give you one fixed compaction behavior; power-loop (3.0) splits it into two independent, config-driven axes:

  • Representation — how each finished send is recorded & rendered: VerbatimRepresentation (full, byte-identical history) or ProjectedRepresentation (a terse per-send plain-text projection). The original detail always stays in the immutable pl_messages audit log.
  • Fold strategy — how older history is compacted once the rendered prefix crosses the budget: LLMSummaryFold (one summary call) or AgenticFold (a bounded tool loop that also persists durable facts as notes).
from power_loop import (
    StatefulAgentLoop, AgentLoopConfig,
    ProjectedRepresentation, AgenticFold,   # mix & match either axis — or pass your own impl
)

cfg = AgentLoopConfig(
    representation=ProjectedRepresentation(max_chars=300),  # terse projection (or VerbatimRepresentation)
    fold_strategy=AgenticFold(keep_last_sends=4),           # summarize older sends + write notes
)
loop = StatefulAgentLoop(llm=llm, dsn="app.db", config=cfg)

Any representation composes with any fold strategy, and each axis is a small Protocol you can implement yourself. A fold always keeps whole sends (it never splits an atomic tool-call/result pair), and the model can call recall_send(send_index=N) / recall_compacted() to pull the full original detail back from the audit log. (The two classes above are public but provisional — added in 3.0, not yet frozen into STABLE_API; AgentLoopConfig itself is Stable.)

Deterministic multi-agent workflows — that the model can author, and that survive a crash

Sub-agent delegation is model-driven ("go do this"). When you want code-driven, deterministic orchestration — fan out over a list, branch on a result, run a pipeline — describe it as a WorkflowSpec and let the engine interpret it. The only LLM calls are the leaves; sequence/parallel/foreach/branch are plain code.

from power_loop.workflow import create_workflow

spec = {
    "name": "research", "input": "the Japanese tea ceremony",
    "root": {"type": "sequence", "steps": [
        {"type": "agent", "id": "plan",
         "spec": {"name": "planner", "system_prompt": "Break the topic into 3 subtopics."},
         "output_schema": {"name": "Plan", "schema": {"type": "object", "required": ["subtopics"],
            "properties": {"subtopics": {"type": "array", "items": {"type": "string"}}}}}},
        {"type": "foreach", "id": "research", "items_from": "plan.subtopics", "as": "t",
         "parallel": True, "max_concurrency": 3,
         "body": {"type": "agent", "id": "r",
                  "spec": {"name": "researcher", "system_prompt": "Write 2 sentences on {{t}}."},
                  "input": "Subtopic: {{t}}"}},
        {"type": "agent", "id": "write",
         "spec": {"name": "writer", "system_prompt": "Synthesize the notes."},
         "inputs_from": ["research"]},
    ]},
}
result = await create_workflow(spec, parent_loop=loop).run()

Validated on creation (every problem reported at once — ideal for an LLM to repair). Run it detached and the parent agent is woken on completion via a durable timer. Crash halfway through the fan-out? resume_run(loop, parent_sid, run_id) replays the planner + finished researchers from the journal and re-runs only what's left. Register it as a tool and the agent builds and submits workflows itself.

Run untrusted sub-agents in real sandboxes — without sandboxing the parent

The default executor runs leaves in-process. The subprocess executor runs each leaf in its own OS process against its own SQLite file (the one-writer-per-file rule holds trivially), and a WorkerLauncher wraps that process — per leaf, by inspecting its granted tools — in gVisor / Docker / firejail.

from power_loop.workflow import SubprocessExecutor, WorkerBootstrap, create_workflow

ex = SubprocessExecutor(
    bootstrap=WorkerBootstrap(llm_from_env=True, tool_preset="core"),
    launcher=my_gvisor_launcher,   # wraps the worker command per leaf; fail-closed
    timeout_s=120,
)
await create_workflow(spec, parent_loop=loop, executor=ex).run()

Durable, operable storage — the part most "agent libraries" skip

The store is the product, so it's built to run for the long haul:

await store.export_session(sid)                 # full session → a JSON archive (incl. compacted turns)
await store.prune_compacted_messages(sid)       # opt-in retention of folded-out originals
await store.vacuum(); await store.checkpoint()  # reclaim disk (SQLite; no-op where N/A)
async with StatefulAgentLoop(...) as loop:      # graceful aclose(): drain in-flight sends, then close
    ...

It survives upgrades — a portable pl_schema_migrations version table (not a SQLite-only PRAGMA) refuses a newer-than-code DB rather than corrupting it, and works identically on every backend.

Observe everything, export anywhere

from power_loop.contrib.jsonl_sink import attach_jsonl_sink, replay
from power_loop.contrib.metrics_sink import attach_metrics_sink, PrometheusBackend

attach_jsonl_sink(bus, "events.jsonl")        # durable; replay("events.jsonl") later
attach_metrics_sink(bus, PrometheusBackend()) # power-loop[prometheus] · or StatsD, or OpenTelemetry spans

Every event carries a process-wide seq and a monotonic clock, so streams totally-order and reconstruct. Sync subscribers run inline by default; opt into a bounded-queue background dispatcher when a sink might block.

Connect a Model Context Protocol server

from power_loop.contrib.mcp import StdioMCPClient, register_mcp_tools   # power-loop[mcp]

client = await StdioMCPClient("npx", ["-y", "@modelcontextprotocol/server-filesystem", "/data"]).connect()
await register_mcp_tools(registry, client, prefix="fs.")   # MCP tools → power-loop ToolDefinitions

The seam is a tiny MCPToolSource Protocol, so the mcp SDK is optional and any client works.

More: hard token budgets, structured output, crash recovery, memory, the blackboard — see examples/ (43 runnable programs) and the docs.


How it compares

power-loop is a kernel, not a platform — that's the whole trade-off.

  • vs. LangChain / LangGraph / LlamaIndex / CrewAI / AutoGen — those are batteries-included frameworks with large ecosystems (connectors, vector stores, integrations) and heavy dependency trees. power-loop deliberately ships none of that: a compact (~24k-line) pure-stdlib core with zero runtime dependencies, and you bring your own tools (or an MCP server). You get durable sessions across SQLite/PG/MySQL, crash-resumable workflows, and real sandbox seams out of the box; you do not get a bundled RAG stack or 100 connectors.
  • Choose power-loop when you want to embed an agent in an existing app, keep your dependency surface tiny, pick your own database, and care about durability + isolation + a stable contract.
  • Choose a framework when you want batteries included, a big integration catalog, and don't mind the weight.

Honestly: power-loop is behind on ecosystem breadth (integrations, community, age) and ahead on embeddability, durability, storage flexibility, and a machine-guarded stable API. Pick accordingly.


Install & configure

pip install 'power-loop[openai]'      # any OpenAI-compatible endpoint
pip install 'power-loop[anthropic]'   # native Anthropic Messages API
pip install 'power-loop[postgres]'    # PostgreSQL backend (asyncpg)
pip install 'power-loop[mysql]'       # MySQL backend (aiomysql)
pip install 'power-loop[all]'         # transports + postgres + mysql + skills/pdf/observability/mcp

Point it at any OpenAI-compatible endpoint (or POWER_LOOP_PROVIDER=anthropic):

POWER_LOOP_BASE_URL=https://api.openai.com/v1
POWER_LOOP_API_KEY=sk-...
POWER_LOOP_MODEL=gpt-4o-mini

Python 3.10+. See Getting Started. Optional extras: postgres, mysql, skills, pdf, prometheus, statsd, otel, mcp.


Stability & SemVer

Since 1.0, the STABLE API (listed in power_loop.STABLE_API) is under SemVer: a breaking change requires a major bump, enforced by a frozen-baseline test in CI — including the flagship StatefulAgentLoop and the LLM contract needed to construct it. Error .code strings are frozen too. The two majors since (2.0 pluggable async storage, 3.0 orthogonal context axes) were exactly that policy in action — breaking changes earned a major bump, each documented in the Changelog.

Tier Meaning
Stable Backward-compatible within a major version; in power_loop.STABLE_API.
Provisional Re-exported from the top level (e.g. open_store, SchemaPolicy); may change in a future minor.
Internal power_loop.core.*, power_loop.runtime.store.* internals; no compatibility promise.

See the API reference.


Honest scope

power-loop orchestrates; it does not, by itself, isolate. The built-in bash/file tools run in-process and inherit the host environment — convenient for trusted, local use, not a security boundary. For untrusted/model-authored commands, inject a sandbox via the ShellBackend seam (tool-level) or run leaves through SubprocessExecutor + WorkerLauncher (process-level). Keep secrets in your orchestrator. See SECURITY.md.

Single-writer-per-session. Per-session ordering is an in-process asyncio.Lock; it gives no cross-process mutual exclusion. With SQLite, run one writer process per file (shard sessions across files). With PostgreSQL/MySQL, sequence allocation is multi-writer-safe (SELECT … FOR UPDATE), but the pending-state machine still assumes one writer drives a given session at a time (the dispatcher/queue layer above is yours). Concurrent first-boot of a fresh server schema should provision out-of-band (SchemaPolicy.VERIFY). See the scaling guide.

Maturity. A 1.0 tag here is a confidence statement about the API/durability contract — not a claim of years of field-hardening. power-loop is young, primarily a single maintainer, with limited public production track record. The contract is machine-guarded and the project is MIT + forkable; weigh the bus factor for your use.


Project & links

  • Used by: DeepTalk — the agent runtime for a 1-on-1 relationship-IM product's in-conversation agents. (Using it in production? PR a line here.)
  • Develop: pip install -e ".[dev]" · ruff check . · pytest -q --no-real (drop --no-real for the live-LLM suite; set a POWER_LOOP_TEST_PG_DSN / POWER_LOOP_TEST_MYSQL_DSN to run the server-backend conformance suites).
  • Docs · Architecture · Storage backends · Changelog · Contributing · Security · License

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

power_loop-3.0.2.tar.gz (315.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

power_loop-3.0.2-py3-none-any.whl (351.7 kB view details)

Uploaded Python 3

File details

Details for the file power_loop-3.0.2.tar.gz.

File metadata

  • Download URL: power_loop-3.0.2.tar.gz
  • Upload date:
  • Size: 315.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for power_loop-3.0.2.tar.gz
Algorithm Hash digest
SHA256 aac114cc5f8f5b9eb6d020696a3f04ab60ce6fdad9bdf9e7c1fd8677e60645d4
MD5 2c16f2d91ebbc8aa9a5b8abbe13c247d
BLAKE2b-256 3c92a40157817e09b89f4bbecc592cc4826c90d032c39dea0694daced2e3f4ea

See more details on using hashes here.

File details

Details for the file power_loop-3.0.2-py3-none-any.whl.

File metadata

  • Download URL: power_loop-3.0.2-py3-none-any.whl
  • Upload date:
  • Size: 351.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for power_loop-3.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 2cd28f417f19622fd2b930ffc081fcaa001cd2f6e3a460d3c0194f4f420895c8
MD5 b95dc88b34208f9cc9f0a7f5012aad39
BLAKE2b-256 66230cadf9bc92ea366f11acad7dd64ba6cfe1acc561787cb60dc315b6980b46

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page