Tuple-Space + Yool safe-speed runtime kernel: lazy 1,000,000+ subagent batch_spawn, adaptive lane concurrency, and a dependency-free OpenAI-compatible provider client for real subagents on DeepSeek / MiMo / OpenRouter / local LLMs.

Project description

simplicio-prompt: AI with 1,000,000+ subagents

simplicio-prompt

Capability-addressing pattern: yool (atomic action) wrapped in tuples (addressable envelopes) over an HAMT (Hash Array Mapped Trie) registry, coordinated through a tuple-space with content-addressable receipts.

This repo is the canonical spec. Vendor it into any project that wants the pattern.
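As a rough illustration of the vocabulary above, the sketch below models a tuple-space whose writes return content-addressable receipts. It is not the kernel's implementation: `MiniTupleSpace`, `put`, `take`, and the SHA-256 receipt format are hypothetical names chosen for this example, and a plain dict stands in for the HAMT registry.

```python
import hashlib
import json


class MiniTupleSpace:
    """Toy tuple-space: hypothetical names, not the simplicio-prompt kernel API."""

    def __init__(self):
        # A plain dict stands in for the HAMT registry in this sketch.
        self._tuples = {}

    def put(self, yool, payload):
        """Store a tuple and return a receipt derived from its content."""
        body = json.dumps({"yool": yool, "payload": payload}, sort_keys=True)
        receipt = hashlib.sha256(body.encode("utf-8")).hexdigest()
        # Identical content maps to the same receipt, so repeated puts are idempotent.
        self._tuples[receipt] = {"yool": yool, "payload": payload}
        return receipt

    def take(self, yool):
        """Remove and return (receipt, tuple) for the first tuple addressed to `yool`."""
        for receipt, item in self._tuples.items():
            if item["yool"] == yool:
                del self._tuples[receipt]
                return receipt, item
        return None

    def __len__(self):
        return len(self._tuples)


space = MiniTupleSpace()
r1 = space.put("lint", {"path": "src/app.py"})
r2 = space.put("lint", {"path": "src/app.py"})  # same content, same receipt
print(r1 == r2, len(space))
```

Because the receipt is a hash of the tuple's content, writing the same work twice yields one stored tuple, which is the property a receipt cache can build on.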

What's new in 1.5

More features, more reach, and a tighter runtime:

  • Now on PyPI: pip install simplicio-prompt ships the dependency-free Python kernel and a simplicio-subagents console command, alongside the existing npm package.
  • Real subagents on any model / provider: DeepSeek, MiMo, OpenRouter, or a free local LLM (Ollama / vLLM / LM Studio) all run through the same --provider / --model flags. Default fan-out is now 64 (the self-consistency / exploration sweet spot); --subagents 600 is opt-in for max-breadth parallel audits.
  • First-class observability: PromptFanout adapters plus per-lane token/cost accounting, circuit-breaker state, and cache stats in every snapshot.
  • Always-on invocation: a UserPromptSubmit hook routes every message through the runtime with no trigger keyword, now including a Gemini CLI target.
  • One-command test suite: npm test runs both the Node CLI/hook tests and the full Python kernel suite (44 tests) in a single pass.

Scope: pick the right tool

This runtime is optimized for always-on multi-step agent operation with subagent fan-out (10²+ subagents) — orchestration, batched audits, parallel exploration. For one-shot deterministic tasks like single-file code edits, single-method additions, or classification, the runtime's tuple-space framing is net-negative on mid-tier models because it shapes output away from the concrete deliverable. Use a task-shaped contract for those (see simplicio-cli's 6-layer contract) and compose: contract = what the task is, this runtime = how the agent operates.

The prompt itself now ships with explicit ONE-SHOT (default) vs BATCH mode selection at the top, so the runtime stays minimal on single-artifact tasks and engages tuple/Yool primitives only when the task X actually implies orchestration or when YOOL_TUPLE_FULL_RUNTIME=1 is set.

Highlight: 1,000,000+ subagents, zero enumeration

simplicio-prompt scales to 1,000,000+ subagents in a single task without enumerating them, without spawning a million processes, and without melting your provider quota.

It does this with batch_spawn(depth, branching, compression_threshold) — a lazy hierarchical fan-out over a Hilbert-indexed tuple graph. The kernel stores virtual-agent counts and content-addressable receipts instead of a flat list of agents, so the cost of representing the work is logarithmic, not linear.

  • depth=4, branching=32 ⇒ 32^4 = 1,048,576 virtual subagents, each materialized only when its tuple is actually visited.
  • 2,833.75x faster scale representation than a flat instruction flow (V2 benchmark).
  • 26.93x faster active execution than naive sequential fan-out.
  • compress_token + prune_idle keep inactive subagents as auditable tokens, so a million-subagent task still fits in a small working set.
  • LaneWorkerPool enforces bounded per-lane concurrency (YOOL_TUPLE_LANE_CONCURRENCY=32, YOOL_TUPLE_MAX_LANE_CONCURRENCY=64), so a million-subagent graph never turns into a million concurrent calls.
  • Provider safety stays intact: receipt/input cache, jittered backoff, circuit breakers, and small-task batching apply at the million-subagent scale exactly as they do at one.
  • Observability is first-class: snapshots include per-lane token/cost usage, circuit breaker state, cache stats, and lane worker success/failure metrics.
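The counting argument behind the list above can be sketched in a few lines. This is only an illustration of lazy hierarchical fan-out under assumed names: `VirtualNode`, `depth_left`, and `child` are hypothetical and do not mirror the kernel's `batch_spawn` internals. The point is that the subagent total is computed arithmetically, and only the nodes along a visited path are ever created.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualNode:
    """Hypothetical stand-in for a lazily spawned subtree (not the kernel's type)."""
    path: tuple          # child indices from the root, e.g. (3, 17)
    depth_left: int      # levels remaining below this node
    branching: int

    @property
    def virtual_agents(self):
        # Leaves under this node, computed arithmetically instead of enumerated.
        return self.branching ** self.depth_left

    def child(self, index):
        """Materialize exactly one child; its siblings stay virtual."""
        if self.depth_left == 0:
            raise ValueError("leaf nodes have no children")
        if not 0 <= index < self.branching:
            raise IndexError(index)
        return VirtualNode(self.path + (index,), self.depth_left - 1, self.branching)


root = VirtualNode(path=(), depth_left=4, branching=32)
print(root.virtual_agents)      # 32 ** 4

# Visiting one leaf materializes only the 4 nodes along its path.
node = root
for index in (3, 17, 0, 31):
    node = node.child(index)
print(node.path, node.virtual_agents)
```

Representing the root costs one small object regardless of how many leaves it stands for; work is paid for only when a path is walked.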

The output shape stays auditable at any scale:

[Tuple Space Snapshot]
[Active Agents/Subagents]      ← materialized, small
[Total Agents/Subagents]       ← virtual, up to 1,000,000+
[Next Yool to Execute]
[Partial Result]

See prompts/agent-runtime-execution-prompt.md and kernel/yool_tuple_kernel.py for the canonical batch_spawn contract.

Real subagents — any model, any provider

Point simplicio-prompt at any OpenAI-compatible provider and one task fans out across 64 real subagents by default — actual LLM calls, not placeholders, through the same safe-speed runtime. The provider doesn't matter: DeepSeek, MiMo, OpenRouter (any model it gateways), a free local LLM (Ollama / vLLM / LM Studio), or any custom endpoint all work through the same --provider / --model flags.

The default of 64 is the sweet spot for self-consistency / exploration (research shows diminishing returns above ~64 samples). Opt in to --subagents 600 (or higher) when the task is truly max-breadth: large parallel audits, item-parallel work where N items ≥ 200, or running the live-bench scenario below.

Verified live on OpenRouter (deepseek/deepseek-chat) with the max-breadth setting: 600/600 subagents, 0 failures, ~103s, ≈$0.045 at illustrative pricing on a single task. Change --subagents to run any count.

batch_spawn represents subagents virtually. To make them real, the kernel ships a dependency-free, OpenAI-compatible provider client and a fan-out runtime that drains N materialized subagent tuples through LaneWorkerPool — so every real call inherits the same guardrails (bounded per-lane concurrency, the receipt/input cache that de-duplicates identical prompts, jittered backoff, and a provider circuit breaker).

  • kernel/providers.py: LLMProvider + presets for deepseek, mimo, local/ollama, and a generic OpenAI-compatible config, with per-call token usage and cost accounting.
  • kernel/subagent_runtime.py: SubagentRuntime fans a task across N real subagents and aggregates results, tokens, and cost.
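To make "bounded per-lane concurrency" concrete, here is a minimal sketch using only the standard library. It is not LaneWorkerPool: `drain_lane` and `fake_subagent_call` are hypothetical, the provider call is simulated with a short sleep, and the limit of 4 is chosen for the demo (the documented default is 32).

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

LANE_CONCURRENCY = 4          # illustrative; the documented default is 32

_lock = threading.Lock()
_in_flight = 0
peak_in_flight = 0


def fake_subagent_call(index):
    """Stand-in for one provider call; records how many run at once."""
    global _in_flight, peak_in_flight
    with _lock:
        _in_flight += 1
        peak_in_flight = max(peak_in_flight, _in_flight)
    time.sleep(0.01)          # simulate network latency
    with _lock:
        _in_flight -= 1
    return f"subagent-{index}: ok"


def drain_lane(count, concurrency=LANE_CONCURRENCY):
    """Run `count` calls with at most `concurrency` in flight at any moment."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() preserves input order, so results line up with subagent indices.
        return list(pool.map(fake_subagent_call, range(count)))


results = drain_lane(64)
print(len(results), peak_in_flight)
```

All 64 calls complete, but the observed peak never exceeds the lane limit, which is the property that keeps a large fan-out from becoming an equally large burst of concurrent requests.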

Run real subagents on DeepSeek. Default fan-out is 64; opt in to --subagents 600 for max-breadth runs.

export DEEPSEEK_API_KEY=sk-...

# default 64 subagents
python kernel/subagent_runtime.py --provider deepseek \
  --task "Brainstorm edge cases and tests for a distributed rate limiter"

# opt-in: max-breadth fan-out for large parallel audits
python kernel/subagent_runtime.py --provider deepseek --subagents 600 \
  --task "Audit these 600 functions in parallel"

# offline cost projection / demo — no key, no network:
python kernel/subagent_runtime.py --provider deepseek --subagents 600 \
  --task "..." --dry-run

# other providers (default 64 subagents)
python kernel/subagent_runtime.py --provider mimo  --task "..."
python kernel/subagent_runtime.py --provider local --subagents 50 --task "..."   # Ollama

Or programmatically:

from kernel.providers import resolve_provider_config, LLMProvider
from kernel.subagent_runtime import SubagentRuntime

provider = LLMProvider(resolve_provider_config("deepseek"))
# default fan-out is 64; bump to 600 for max-breadth parallel audits
report = SubagentRuntime(provider).run("Audit this module", subagents=64)
print(report.format_summary())   # completed/failed, tokens, total cost in USD

Provider configuration

| Preset | Default base URL | API key env | Notes |
| --- | --- | --- | --- |
| deepseek | https://api.deepseek.com/v1 | DEEPSEEK_API_KEY | cheap cloud, OpenAI-compatible |
| mimo | https://api.mimo.ai/v1 | MIMO_API_KEY | set MIMO_BASE_URL to your endpoint |
| openrouter | https://openrouter.ai/api/v1 | OPENROUTER_API_KEY | OpenAI-compatible gateway; set --model (e.g. deepseek/deepseek-chat) |
| local / ollama | http://localhost:11434/v1 | none | free/offline (Ollama, vLLM, LM Studio) |
| (any other id) | SIMPLICIO_LLM_BASE_URL | SIMPLICIO_LLM_API_KEY | generic OpenAI-compatible |

Per-provider env overrides (example for DeepSeek): DEEPSEEK_BASE_URL, DEEPSEEK_MODEL, DEEPSEEK_PROMPT_COST_PER_MTOK, DEEPSEEK_COMPLETION_COST_PER_MTOK. Cost defaults are illustrative — set the *_COST_PER_MTOK vars to your live contract pricing (e.g. 0.003) so the reported cost_usd matches your bill.
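The sketch below shows one plausible way the *_COST_PER_MTOK variables could turn token counts into a dollar figure. The formula (price per million tokens, prompt and completion priced separately) is an assumption inferred from the variable names, `cost_usd` here is a hypothetical helper rather than the kernel's function, and the example prices are made up for the arithmetic.

```python
import os


def cost_usd(prompt_tokens, completion_tokens, prefix="DEEPSEEK", env=None):
    """Estimate cost from per-million-token prices read from the environment."""
    env = os.environ if env is None else env
    # Fallback prices of 0.0 are placeholders for the sketch, not real pricing.
    prompt_price = float(env.get(f"{prefix}_PROMPT_COST_PER_MTOK", "0.0"))
    completion_price = float(env.get(f"{prefix}_COMPLETION_COST_PER_MTOK", "0.0"))
    total = prompt_tokens * prompt_price + completion_tokens * completion_price
    return total / 1_000_000


# Made-up prices, used only to make the arithmetic easy to follow.
example_env = {
    "DEEPSEEK_PROMPT_COST_PER_MTOK": "0.25",
    "DEEPSEEK_COMPLETION_COST_PER_MTOK": "1.00",
}
estimate = cost_usd(1_200_000, 300_000, env=example_env)
print(f"{estimate:.4f}")
```

With these inputs the prompt side contributes 1.2 × 0.25 and the completion side 0.3 × 1.00, so the estimate is 0.60 USD; swapping in contract pricing changes only the two environment values.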

Acknowledgement

Special thanks to Jesse Daniel Brown, PhD, my mentor, a California, USA native and author of 100+ scientific articles. His humanitarian and educational perspective on programming, AI, and scientific work helped reinforce the mission behind this repository: practical agent systems that increase human capability through safer, more auditable automation.

V2 safe-speed infographics

English

[Image: YOOL V2 Safe-Speed Runtime infographic in English]

Portuguese (Brazil)

[Image: YOOL V2 Safe-Speed Runtime infographic in Brazilian Portuguese]

Infographic Explanation

The infographics compare a loose prompt flow against the simplicio-prompt V2 safe-speed runtime. The left side shows the old failure modes: flat agent lists, sequential work, repeated provider calls, no cache, fixed concurrency, retry storms, large LLM context, and weak audit trails.

The right side shows the V2 path: tuple-space routing, lazy batch_spawn, adaptive LaneWorkerPool, receipt/input cache, small-task batching, circuit breakers, backoff with jitter, context compression, local yool routing, and speculative execution only for idempotent work. The practical result is faster delivery through avoided repeat work and safer provider behavior, not through unbounded calls.

Measured V2 benchmark highlights:

  • Scale representation: 2,833.75x faster than a normal instruction flow.
  • Active execution: 26.93x faster than normal sequential execution.
  • Cache: 4x fewer provider calls, a 75% reduction.
  • Batching: 32x fewer small-task calls, a 96.88% reduction.
  • Circuit breaker: 64x fewer failure attempts, a 98.44% reduction.
  • Token economy: 76.32% estimated savings through context compression.
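To show where a figure like "4x fewer provider calls" can come from, the sketch below runs a contrived workload through an input-hash cache. `InputCache` and `fake_provider` are hypothetical, and the workload (every prompt repeated four times) is constructed to produce a 4x ratio; it is not the benchmark's workload.

```python
import hashlib


class InputCache:
    """Hypothetical input-hash cache; the kernel's cache may differ."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def call(self, prompt, provider_fn):
        """Return a cached result for an identical prompt, else call the provider."""
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = provider_fn(prompt)
        self._store[key] = result
        return result


provider_calls = []


def fake_provider(prompt):
    """Stand-in for a real LLM call; records each invocation."""
    provider_calls.append(prompt)
    return prompt.upper()


cache = InputCache()
prompts = ["audit a", "audit b", "audit c", "audit d"] * 4   # each prompt appears 4 times
outputs = [cache.call(p, fake_provider) for p in prompts]
reduction = 1 - len(provider_calls) / len(prompts)
print(len(prompts), len(provider_calls), f"{reduction:.0%}")
```

Sixteen requests reach the provider only four times, a 75% reduction; real savings depend entirely on how often identical inputs recur.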

Quick read

  • YOOL_TUPLE_HAMT.md - full spec with diagrams, algorithms, examples, guardrails.
  • kernel/yool_tuple_kernel.py - reference Python kernel with lazy batch_spawn, compress_token, hookwall, indexed tuple-space scans, and lane worker fan-out.
  • prompts/agent-runtime-execution-prompt.md - ready prompt for Claude, Codex, Hermes, and other coding agents.
  • examples/ - runnable minimal implementations (Python, Node).
  • guardrails/ - CPU throttle + disk GC reference implementations.
  • adopters.md - projects that vendor this spec.

Install via npm

The repo ships as an npm package and as a multi-IDE plugin. Use it without cloning:

# print the full prompt
npx simplicio-prompt

# print only the `## Prompt` body (no surrounding markdown)
npx simplicio-prompt --raw

# install as a plugin for one (or many) coding agents
npx simplicio-prompt --target claude-code     # → CLAUDE.md
npx simplicio-prompt --target codex           # → AGENTS.md
npx simplicio-prompt --target hermes          # → AGENTS.md
npx simplicio-prompt --target opencode        # → AGENTS.md  (alias: openclaw)
npx simplicio-prompt --target cursor          # → .cursor/rules/*.mdc + .cursorrules
npx simplicio-prompt --target copilot         # → .github/copilot-instructions.md
npx simplicio-prompt --target cline           # → .clinerules/simplicio-prompt.md
npx simplicio-prompt --target aider           # → CONVENTIONS.md
npx simplicio-prompt --target gemini          # → GEMINI.md
npx simplicio-prompt --install-all            # → every target above

# inspect / discover
npx simplicio-prompt --list-targets

Or add it as a dependency and consume it programmatically:

npm install simplicio-prompt

import {
  getPrompt,
  getPromptSection,
  getPromptPath,
  getTargets,
  findTarget,
} from "simplicio-prompt";

const fullMarkdown = getPrompt();        // entire prompt file
const promptOnly   = getPromptSection(); // just the `## Prompt` body
const filePath     = getPromptPath();    // absolute path on disk
const targets      = getTargets();       // multi-IDE plugin target registry
const cursor       = findTarget("cursor");

Every install wraps the prompt in <!-- simplicio-prompt:start --> / <!-- simplicio-prompt:end --> markers so reinstalling updates the block in place instead of duplicating it. The Cursor .mdc rule and any new directory (.cursor/rules/, .github/, .clinerules/) are created automatically.
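The update-in-place behavior can be sketched as follows. The installer itself is the npm CLI; this Python version is only an illustration of the marker technique, and `upsert_block` is a hypothetical helper, not part of the package.

```python
import re

START = "<!-- simplicio-prompt:start -->"
END = "<!-- simplicio-prompt:end -->"


def upsert_block(document, prompt):
    """Replace the marked block if present, otherwise append it."""
    block = f"{START}\n{prompt}\n{END}"
    pattern = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(document):
        # A function replacement keeps backslashes in `prompt` from being read as escapes.
        return pattern.sub(lambda _match: block, document, count=1)
    separator = "" if document.endswith("\n") or not document else "\n"
    return f"{document}{separator}{block}\n"


first = upsert_block("# My project\n", "prompt v1")
second = upsert_block(first, "prompt v2")   # reinstall: updates, does not duplicate
print(second)
```

Running the function twice leaves exactly one marked block in the file, and text outside the markers is untouched.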

Install via PyPI (Python)

The reference kernel ships as a dependency-free Python package, so you can run real subagents on any OpenAI-compatible provider without cloning the repo:

pip install simplicio-prompt

from kernel.providers import resolve_provider_config, LLMProvider
from kernel.subagent_runtime import SubagentRuntime
from kernel.yool_tuple_kernel import build_default_space

# 1,000,000+ subagents represented lazily, materialized only when visited
space, root = build_default_space()
receipt = space.batch_spawn(root, "codex_worker", depth=4, branching=32)
print(receipt.virtual_agents)  # 1048576

# real subagents on any provider (DeepSeek / MiMo / OpenRouter / local)
provider = LLMProvider(resolve_provider_config("deepseek"))
# default fan-out is 64; bump to 600 for max-breadth parallel audits
report = SubagentRuntime(provider).run("audit this module", subagents=64)
print(report.format_summary())  # completed/failed, tokens, total cost in USD

The install also exposes a console entry point for offline cost projection and live fan-out:

# default fan-out (64 subagents) — offline cost projection, no API key
simplicio-subagents --provider deepseek --task "..." --dry-run

# max-breadth fan-out (opt-in) for large parallel audits
simplicio-subagents --provider deepseek --subagents 600 --task "..." --dry-run

Multi-IDE plugin matrix

| Target id | IDE / agent | Files written |
| --- | --- | --- |
| claude-code | Anthropic Claude Code | CLAUDE.md |
| codex | OpenAI Codex CLI | AGENTS.md |
| hermes | Nous Research Hermes | AGENTS.md |
| opencode (alias openclaw) | OpenCode / OpenClaw | AGENTS.md |
| cursor | Cursor IDE | .cursor/rules/simplicio-prompt.mdc, .cursorrules |
| copilot (alias github-copilot) | GitHub Copilot | .github/copilot-instructions.md |
| cline | Cline (VS Code) | .clinerules/simplicio-prompt.md |
| aider | Aider | CONVENTIONS.md |
| gemini (alias gemini-cli) | Google Gemini CLI | GEMINI.md |

Auto-invocation per agent

The runtime is meant to be always-on — every user message is treated as the task and routed through the runtime, with no trigger keyword. How that invocation is wired depends on the agent:

| Agent | Invocation mechanism | Always-on? |
| --- | --- | --- |
| Claude Code | UserPromptSubmit hook (plugins/claude-code/hooks/) injects the runtime contract on every prompt; plus a simplicio-runtime skill and slash commands | yes, programmatic |
| Codex / Hermes / OpenCode | AGENTS.md loaded as standing instructions | yes (context file) |
| Cursor | .cursor/rules/*.mdc with alwaysApply: true | yes (always-apply rule) |
| GitHub Copilot | .github/copilot-instructions.md | yes (context file) |
| Cline | .clinerules/simplicio-prompt.md | yes (context file) |
| Aider | CONVENTIONS.md | yes (context file) |
| Gemini CLI | GEMINI.md loaded every turn | yes (context file) |

A true prompt-submit hook (inject-on-every-prompt with programmatic stand-down) exists only in Claude Code. For the other agents, the always-applied context file is the invocation — it is read on every turn, so the contract is always present. All of them honor in-message stand-down phrases ("stop", "cancel", "exit runtime", "ignore simplicio").
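A programmatic stand-down check might look like the sketch below. The phrase list comes from the text above; everything else is an assumption, including the matching policy (whole message, case-insensitive, trailing punctuation ignored) and the helper names `is_stand_down` and `route`. The actual hook ships as a Node script and may match differently.

```python
STAND_DOWN_PHRASES = ("stop", "cancel", "exit runtime", "ignore simplicio")


def is_stand_down(message):
    """Return True when the whole message is one of the stand-down phrases."""
    # Collapse whitespace, lowercase, and drop surrounding spaces and . or ! marks.
    normalized = " ".join(message.lower().split()).strip(" .!")
    return normalized in STAND_DOWN_PHRASES


def route(message):
    """Decide whether a message goes through the runtime or stands it down."""
    return "stand-down" if is_stand_down(message) else "runtime"


for text in ("Stop.", "  exit   runtime ", "stop the flaky test from failing"):
    print(repr(text), "->", route(text))
```

Matching the whole message rather than a substring is a deliberate choice in this sketch: a task such as "stop the flaky test from failing" still reaches the runtime.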

Claude Code plugin

plugins/claude-code/ is a full Claude Code plugin: drop it into ~/.claude/plugins/simplicio-prompt/ (or your project's .claude/plugins/) to get an always-on hook, three slash commands, and a runtime skill:

  • UserPromptSubmit hook (hooks/hooks.json, hooks/user-prompt-submit.mjs): injects the runtime contract (hooks/runtime-directive.md) on every prompt, unless the prompt is an explicit stand-down. This is the always-on invocation.
  • /simplicio <task>: run the next task through the Tuple-Space + Yool runtime.
  • /simplicio-install <target>: install the runtime contract into the current repo (claude-code, codex, cursor, copilot, gemini, all, …).
  • /simplicio-status on|off|<field>:on: toggle the opt-in status output.
  • simplicio-runtime skill: auto-activates in repos that vendor the spec.

Legacy --install

The original single-file installer still works for backwards compatibility:

npx simplicio-prompt --install CLAUDE.md
npx simplicio-prompt --install AGENTS.md
npx simplicio-prompt --install .cursorrules

Integration adapter

For orchestration repos that want the runtime without copying tuple boilerplate, use the Python adapter:

from examples.python.prompt_fanout import PromptFanout

fanout = PromptFanout(repo="my-service", authority="simplicio-sprint")
root, receipt = fanout.spawn_task(
    "review checkout edge cases",
    mapper_context={"target": "src/checkout.ts"},
    depth=2,
    branching=8,
)
fanout.record_tokens("analysis", prompt_tokens=1200, completion_tokens=300, cost_usd=0.02)
print(receipt.virtual_agents)
print(fanout.snapshot()["token_usage"])

simplicio-dev-cli can use the same adapter for internal verification reasoning when it already has structured mapper context.

How to use the prompt

Use simplicio-prompt as a canonical execution prompt for coding agents such as Claude, Codex, Hermes, Cursor, Cline, or any assistant that can read repository instructions.

  1. Run npx simplicio-prompt --install CLAUDE.md (or paste the ## Prompt section from prompts/agent-runtime-execution-prompt.md into AGENTS.md, CLAUDE.md, .cursorrules, or a custom system prompt).
  2. In the target repository, just ask for work in your own words. You do not need to start the message with Implement — any user input (a sentence, a bug description, a code snippet, a one-word request) is treated as the task X and routed through the same runtime. The only opt-outs are explicit stand-down phrases like "stop", "cancel", "exit runtime".
  3. The agent will read the canonical files listed in the prompt, decompose the task into a Hilbert-indexed tuple graph, create a root tuple, route active work through tuple-space primitives, and use LaneWorkerPool plus the V2 safe-speed controls.
  4. Status output is opt-in (default: silent). Enable with YOOL_TUPLE_STATUS=true (or status_output=true runtime flag). When on, the agent returns this shape:
[Tuple Space Snapshot]
[Active Agents/Subagents]
[Total Agents/Subagents]
[Next Yool to Execute]
[Partial Result]

Per-field toggles (default false): YOOL_TUPLE_STATUS_SNAPSHOT, YOOL_TUPLE_STATUS_ACTIVE, YOOL_TUPLE_STATUS_TOTAL, YOOL_TUPLE_STATUS_NEXT, YOOL_TUPLE_STATUS_PARTIAL.
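One way to read these toggles is sketched below. The variable names and field labels are taken from the text above; the precedence (the master YOOL_TUPLE_STATUS flag enables every field, otherwise each per-field flag enables only its own line) and the accepted truthy spellings are assumptions, and `enabled_status_fields` is a hypothetical helper.

```python
import os

FIELDS = {
    "SNAPSHOT": "[Tuple Space Snapshot]",
    "ACTIVE": "[Active Agents/Subagents]",
    "TOTAL": "[Total Agents/Subagents]",
    "NEXT": "[Next Yool to Execute]",
    "PARTIAL": "[Partial Result]",
}


def _flag(env, name):
    """Interpret an env value as a boolean; unset means false (silent by default)."""
    return env.get(name, "false").strip().lower() in ("1", "true", "on", "yes")


def enabled_status_fields(env=None):
    """Return the status lines that should be printed for this environment."""
    env = os.environ if env is None else env
    everything = _flag(env, "YOOL_TUPLE_STATUS")
    return [
        label
        for key, label in FIELDS.items()
        if everything or _flag(env, f"YOOL_TUPLE_STATUS_{key}")
    ]


print(enabled_status_fields({}))
print(enabled_status_fields({"YOOL_TUPLE_STATUS_NEXT": "true"}))
print(enabled_status_fields({"YOOL_TUPLE_STATUS": "true"}))
```

With no variables set the list is empty, which matches the documented silent default.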

For high-throughput local runs, set the runtime environment variables before starting the agent or scripts (PowerShell syntax shown; in a POSIX shell use export NAME=value):

$env:YOOL_TUPLE_LANE_CONCURRENCY="32"
$env:YOOL_TUPLE_MAX_LANE_CONCURRENCY="64"
$env:YOOL_TUPLE_CPU_QUOTA_PCT="95"
$env:YOOL_TUPLE_QUEUE_MAXSIZE="8192"
$env:YOOL_TUPLE_COMPRESSION_THRESHOLD="1024"
$env:YOOL_TUPLE_CACHE_MAX_ENTRIES="16384"
$env:YOOL_TUPLE_CACHE_TTL_S="3600"
$env:YOOL_TUPLE_API_MAX_RETRIES="3"
$env:YOOL_TUPLE_API_BACKOFF_BASE_MS="100"
$env:YOOL_TUPLE_API_BACKOFF_MAX_MS="5000"
$env:YOOL_TUPLE_CIRCUIT_FAILURE_THRESHOLD="5"
$env:YOOL_TUPLE_CIRCUIT_COOLDOWN_S="30"
$env:YOOL_TUPLE_BATCH_SMALL_TASK_SIZE="32"
$env:YOOL_TUPLE_CONTEXT_COMPRESSION_CHARS="6000"

Run the reference kernel and tests:

python kernel/yool_tuple_kernel.py
python -m unittest discover -s tests -p "test_*.py"

V2 benchmark report

The V2 report is the main evidence for the safe-speed runtime. Read it before adopting the prompt in another project.

What the V2 report shows:

  • 2,833.75x faster scale representation than normal instruction flow.
  • 26.93x faster active execution than normal sequential execution.
  • 4x fewer repeated provider calls through receipt/input cache.
  • 32x fewer small-task calls through batching.
  • 64x fewer provider failure attempts through circuit breakers.
  • 76.32% estimated token savings through context compression.

The key point: V2 speeds up by avoiding repeated work and controlling provider pressure. It does not depend on unsafe infinite calls, unbounded concurrency, or retry storms.
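Small-task batching is the easiest of these controls to see in isolation. The sketch below groups tasks into batches of 32 (the documented YOOL_TUPLE_BATCH_SMALL_TASK_SIZE default) so that each batch could be sent as one provider call. `batch_small_tasks` is a hypothetical helper, and the 1,024-task workload is contrived for the arithmetic.

```python
def batch_small_tasks(tasks, batch_size=32):
    """Split tasks into consecutive batches of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [tasks[i:i + batch_size] for i in range(0, len(tasks), batch_size)]


tasks = [f"check item {n}" for n in range(1024)]
batches = batch_small_tasks(tasks)
calls_saved = 1 - len(batches) / len(tasks)
print(len(tasks), len(batches), calls_saved)
```

Here 1,024 small tasks collapse into 32 calls, i.e. 32x fewer calls or a reduction of 1 - 1/32 = 0.96875, which rounds to the 96.88% quoted in the benchmark highlights.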

High-throughput runtime defaults

The reference kernel is tuned for speed while keeping host guardrails explicit:

| Env var | Default | Purpose |
| --- | --- | --- |
| YOOL_TUPLE_LANE_CONCURRENCY / YOOL_LANE_CONCURRENCY | 32 | Preferred workers per lane. |
| YOOL_TUPLE_MAX_LANE_CONCURRENCY / YOOL_MAX_LANE_CONCURRENCY | 64 | Ceiling for workers per lane. |
| YOOL_TUPLE_CPU_QUOTA_PCT / YOOL_CPU_QUOTA_PCT | 95 | Default per-yool CPU budget. |
| YOOL_TUPLE_QUEUE_MAXSIZE / YOOL_QUEUE_MAXSIZE | 8192 | Lane queue scan cap. |
| YOOL_TUPLE_COMPRESSION_THRESHOLD / YOOL_COMPRESSION_THRESHOLD | 1024 | Active materialized agents before pruning. |
| YOOL_TUPLE_CACHE_MAX_ENTRIES / YOOL_CACHE_MAX_ENTRIES | 16384 | Receipt/input-hash cache size. |
| YOOL_TUPLE_CACHE_TTL_S / YOOL_CACHE_TTL_S | 3600 | Cache TTL in seconds. |
| YOOL_TUPLE_API_MAX_RETRIES / YOOL_API_MAX_RETRIES | 3 | Retry budget for transient API/LLM failures. |
| YOOL_TUPLE_API_BACKOFF_BASE_MS / YOOL_API_BACKOFF_BASE_MS | 100 | Initial jittered backoff delay. |
| YOOL_TUPLE_API_BACKOFF_MAX_MS / YOOL_API_BACKOFF_MAX_MS | 5000 | Backoff ceiling. |
| YOOL_TUPLE_CIRCUIT_FAILURE_THRESHOLD / YOOL_CIRCUIT_FAILURE_THRESHOLD | 5 | Failures before opening provider breaker. |
| YOOL_TUPLE_CIRCUIT_COOLDOWN_S / YOOL_CIRCUIT_COOLDOWN_S | 30 | Provider cooldown after breaker opens. |
| YOOL_TUPLE_BATCH_SMALL_TASK_SIZE / YOOL_BATCH_SMALL_TASK_SIZE | 32 | Default small-task batch size. |
| YOOL_TUPLE_CONTEXT_COMPRESSION_CHARS / YOOL_CONTEXT_COMPRESSION_CHARS | 6000 | Large LLM context compression threshold. |
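Each setting is listed under two names. A reader for such paired variables might look like the sketch below; the lookup order (YOOL_TUPLE_* first, then the shorter YOOL_* spelling) and the clamping of the preferred lane size to the ceiling are assumptions, and `read_int_setting` is a hypothetical helper rather than the kernel's function.

```python
import os


def read_int_setting(name, default, env=None):
    """Read YOOL_TUPLE_<name>, then YOOL_<name>, then fall back to `default`."""
    env = os.environ if env is None else env
    for key in (f"YOOL_TUPLE_{name}", f"YOOL_{name}"):
        raw = env.get(key)
        if raw is not None and raw.strip():
            return int(raw)
    return default


def effective_lane_concurrency(env=None):
    """Assumed policy: the preferred value never exceeds the configured ceiling."""
    preferred = read_int_setting("LANE_CONCURRENCY", 32, env)
    ceiling = read_int_setting("MAX_LANE_CONCURRENCY", 64, env)
    return min(preferred, ceiling)


print(effective_lane_concurrency({}))
print(effective_lane_concurrency({"YOOL_LANE_CONCURRENCY": "128"}))
```

With nothing set the documented defaults apply; an oversized preferred value is capped at the ceiling instead of being passed through.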

Safe speedups now live in the kernel, not only in the prompt: receipt/input cache, adaptive lane concurrency, jittered backoff, provider circuit breakers, small-task batching, prompt/context compression, local yool routing, and speculative execution only for tuples marked idempotent=True.
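Two of these controls, jittered backoff and the provider circuit breaker, can be sketched with the documented defaults (base 100 ms, ceiling 5000 ms, 5 failures, 30 s cooldown). This is not the kernel's code: the source only says "jittered backoff", so the full-jitter variant used here is an assumption, and `backoff_delay_ms` and `CircuitBreaker` are hypothetical names. The simulated outage is contrived.

```python
import random


def backoff_delay_ms(attempt, base_ms=100, max_ms=5000, rng=random):
    """Full-jitter exponential backoff: uniform in [0, min(max_ms, base_ms * 2**attempt)]."""
    ceiling = min(max_ms, base_ms * (2 ** attempt))
    return rng.uniform(0, ceiling)


class CircuitBreaker:
    """Opens after `failure_threshold` consecutive failures, then cools down."""

    def __init__(self, failure_threshold=5, cooldown_s=30):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.opened_at = None

    def allow(self, now):
        """Return True if a call may be attempted at time `now` (seconds)."""
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cooldown_s:
            # Cooldown elapsed: close the breaker and let calls through again.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_failure(self, now):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now

    def record_success(self):
        self.failures = 0


breaker = CircuitBreaker()
attempted = 0
for tick in range(320):                 # 320 requests against a provider that is down
    now = tick * 0.01                   # all well inside the 30 s cooldown
    if breaker.allow(now):
        attempted += 1
        breaker.record_failure(now)
print(attempted)
```

In this contrived outage only the first five requests reach the provider; the rest are rejected locally until the cooldown passes, which is how a breaker prevents a retry storm.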

Why a separate repo

The pattern is cross-project. SendSprint, llm-project-mapper, future agents - all consume the same spec. One source of truth, vendored on demand.

License

MIT © Wesley Simplicio. See LICENSE.
