Record computer activity and generate step-by-step instructions
Systemu
A personal AI workforce you instruct in plain language or by showing it, and that grows the capabilities it needs to finish the job, under your governance.
Ask a quick question and get an answer in seconds. Hand a whole task to a chat and an AI specialist runs it end-to-end. Or record a task on screen once and replay it forever. However you instruct it, when the agent hits something it lacks mid-run (a tool that doesn't exist, a skill it wasn't given, a file it can't read), it doesn't fail and it doesn't fake it. It requests the missing capability, and an always-on Governor grants, denies, or escalates the request by risk. Every action is gated, logged, and local. Self-provisioning, made safe.
Three ways to put it to work:
- 💬 Ask: a quick question in Chat comes back in seconds (plan-first, and honest when an answer is only partial).
- 🗣️ Delegate: describe a whole task in plain language; an AI specialist runs it end-to-end through the governed pipeline.
- 🎬 Demonstrate: record a task on screen once; Systemu turns it into a reusable workflow you replay in one click.
Tell it or show it, verbally or visually. Either way, the agent assembles the capabilities it needs as it goes, under your approval.
Why Systemu is different
Most automation is frozen at design time. RPA scripts break when a selector moves; agent frameworks can only use the tools you wired up in advance. But the capability an agent actually needs is usually discovered mid-task: a tool that doesn't exist yet, a skill it wasn't given, a file it can't read. The system guesses the toolkit up front, and the agent, the one actually doing the work, can't ask for more.
Systemu inverts that. Instead of the system pushing a fixed harness to the agent, the running agent pulls the capabilities it lacks at runtime, and an always-on Governor arbitrates every request by risk, auto-granting the safe and escalating the rest to you. The agent assembles its own harness, just-in-time, under governance. We named the pattern Reverse-Harness and built a benchmark for it.
It starts where RPA starts (record a task once) and goes where RPA can't: the agent grows to finish the job. Built for work that has consequences:
- Governed self-provisioning: forging a tool, attaching an MCP server, reading a secret, spawning sub-agents. Each is a request the Governor grants, denies, or escalates. High-risk requests always land as one card in your Inbox with a plain-English summary and a safe default.
- Local-first: your recordings, workflows, memory, and results live in a vault on your machine. API keys are never typed into the browser.
- Honest by construction: tool results are verified (a no-output call is a failure, not a phantom success), outcomes report file paths you can open, and "couldn't do it" is never dressed up as done.
The Reverse-Harness pattern
Classic agent harnesses are push: the system decides, at design time,
which tools and permissions an agent gets. Reverse-Harness flips it to
pull: the running agent proposes the capabilities it needs and a
governance layer arbitrates them live. REQUEST_HARNESS becomes a
first-class loop verb, the inverse of TOOL_CALL: "provision a capability
I lack" vs "use one I have."
    PUSH (classic harness)            PULL (Reverse-Harness)
    design time · fixed               runtime · just-in-time
    ──────────────────────            ──────────────────────────
    system picks the tools            agent hits a gap mid-task
             │                                  │
             ▼                                  ▼
    agent is frozen with              REQUEST_HARNESS:
    whatever it was handed            "provision what I lack"
             │                                  │
             ▼                                  ▼
    gap at runtime → it fails         Governor arbitrates by risk
                                      (grant · deny · escalate)
                                                │
                                                ▼
                                      capability leased + logged,
                                      revocable → the run continues
It generalizes capability acquisition from one class (tools, at design time) to six families the Governor arbitrates at runtime:
| Family | The agent requests… | Default gate |
|---|---|---|
| Tool | a new executable tool (forge), or reuse of an existing one | reuse auto-grants; new code escalates |
| Skill | a procedure (SKILL.md), new or reused | reuse low-risk; new text → review |
| Access | reading a file / resource / secret | whitelisted read low; write / secret / network escalates |
| Compute | more iterations / think-budget | within ceiling low; over ceiling escalates |
| Sub-agent | a bounded fleet of parallel child agents | depth + budget clamped; beyond → escalate |
| MCP | attaching a Model Context Protocol server | re-attach low; new server escalates (SSRF-guarded, tool-hash-pinned) |
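The "Default gate" column can be read as a small deterministic policy function. The sketch below is only an illustration of that reading: the `Request` fields, the `Decision` enum, and `default_gate` are hypothetical names, not Systemu's API, and it simplifies "low-risk" to an automatic grant and "review" to an escalation.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    GRANT = "grant"
    DENY = "deny"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Request:
    """Hypothetical shape of a capability request (illustrative only)."""
    family: str                     # "tool", "skill", "access", "compute", "subagent", "mcp"
    reuse: bool = False             # re-requesting something that already exists
    within_limit: bool = True       # compute ceiling, or sub-agent depth + budget
    whitelisted_read: bool = False  # access: read-only and on an allow-list


def default_gate(req: Request) -> Decision:
    """Deterministic first pass that mirrors the table's default gates."""
    if req.family in ("tool", "skill", "mcp"):
        # Reuse / re-attach is low risk; anything newly created escalates.
        return Decision.GRANT if req.reuse else Decision.ESCALATE
    if req.family == "access":
        # Only whitelisted reads pass; writes, secrets, and network escalate.
        return Decision.GRANT if req.whitelisted_read else Decision.ESCALATE
    if req.family in ("compute", "subagent"):
        # Inside the ceiling / clamp is fine; beyond it goes to the operator.
        return Decision.GRANT if req.within_limit else Decision.ESCALATE
    # Unknown family: never grant by default.
    return Decision.ESCALATE
```

Note that the fallthrough escalates rather than grants, so an unrecognised request family can never slip through by default.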
The arbitration rests on two ideas we think are non-obvious:
- A self-requested capability is more dangerous than a pre-provisioned one, because the agent chose it, so it is gated more strictly, not less.
- Judgment can only ever downgrade toward safe. When an ambiguous request needs an LLM judge, the judge may deny or escalate; it can never grant beyond policy or "open a hole." A judge fault fails to escalation, not to grant.
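The second idea, a judge that can only downgrade, can be sketched as a combiner over the policy outcome and the judge verdict. This is a minimal sketch under assumptions: `arbitrate` and the string outcomes are hypothetical, and treating a policy-level deny as final is a choice made here for illustration, not something the text specifies.

```python
from typing import Callable

VALID = ("grant", "deny", "escalate")


def arbitrate(policy: str, judge: Callable[[], str]) -> str:
    """Combine a deterministic policy outcome with an LLM judge verdict.

    The judge may tighten the outcome (deny / escalate) but never loosen it.
    """
    if policy == "deny":
        return "deny"          # assumption: a policy deny is final
    try:
        verdict = judge()
    except Exception:
        return "escalate"      # a judge fault fails to escalation, not to grant
    if verdict not in VALID:
        return "escalate"      # malformed verdicts are treated as faults
    if verdict == "grant":
        return policy          # a judge "grant" confirms policy, never exceeds it
    return verdict             # "deny" or "escalate": always allowed
```

With this shape, `arbitrate("escalate", lambda: "grant")` still escalates: the judge cannot open a hole that policy did not already allow.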
Every grant is leased and logged: minted on grant, written to a per-run
decision-audit ledger with its outcome, and revocable in one click.
Each self-built capability carries the provenance of the run that made it.
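A leased, logged grant can be pictured as an append-only JSONL ledger per run. The sketch below assumes a record layout and helper names (`Lease`, `grant`, `revoke`, `append_ledger`) that are purely illustrative; only the idea of a per-run ledger with grant and revoke events comes from the text.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Lease:
    """Hypothetical lease record; field names are illustrative."""
    run_id: str          # provenance: the run that asked for the capability
    capability: str
    lease_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    granted_at: float = field(default_factory=time.time)
    revoked: bool = False


def append_ledger(path: Path, event: str, lease: Lease) -> None:
    # Append-only: one JSON object per line, never rewritten in place.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"event": event, **asdict(lease)}) + "\n")


def grant(path: Path, run_id: str, capability: str) -> Lease:
    lease = Lease(run_id=run_id, capability=capability)
    append_ledger(path, "grant", lease)
    return lease


def revoke(path: Path, lease: Lease) -> None:
    lease.revoked = True
    append_ledger(path, "revoke", lease)
```

Because revocation is a new ledger line rather than an edit, the history of what was granted, when, and by which run stays intact.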
We distilled the whole thing into six reusable patterns
(Pull-Provisioning, the REQUEST_HARNESS verb, Risk-Tiered Arbitration,
Attributed & Revocable Self-Grants, Off-Path Judgment, and a Provenance
Ledger) and benchmarked it (see Evidence).
What it looks like
You hand a folder of scanned invoices to a chat: "pull the totals into a spreadsheet." The specialist starts, and finds it has no PDF-table extractor. Instead of failing, it requests one:
REQUEST_HARNESS{ kind: tool → "extract a table from a PDF" }
- New code is HIGH risk, so the Governor escalates: one card lands in your Inbox: "forge pdf_table_extract? [view code] · [approve] · [deny]."
- You approve. Systemu writes the tool, dry-runs it to prove it works, deploys it with an agent-built badge, and the run resumes, now holding the capability it lacked thirty seconds ago.
- Next time, the tool already exists: no request, no gate, instant.
Notice the gap → request → govern → grow. That's the whole loop.
How Systemu grows itself
Self-provisioning isn't a one-off. Capabilities the agent acquires persist (attributed and revocable), and the system keeps making them better:
- Forge tools on demand: a missing tool is written (spec → code), dry-run-validated behind a gate, then deployed as a first-class, reusable tool. You see exactly what the agent built itself, with an agent-built badge, and can revoke it anytime.
- Recalibrate what's inadequate: when a tool or skill keeps failing, the runtime diagnoses it and either repairs it in place or forks a specialized version, re-validated before it ships.
- Evolve over time: an evolution engine reviews real runs and proposes improvements (merge duplicate specialists, upgrade a persona with a discovered skill). Proposals land in your Inbox; nothing auto-applies.
- Remember: episodic memory captures what each run learned, a curator consolidates skills over time (archive, never delete), and a capability ledger tracks what actually works.
Every one of these is governed: forging, recalibration, and evolution surface as approval cards by default. The agent gets more capable; you stay in control of what it keeps.
Evidence
Reverse-Harness is being validated, not just asserted. Our Capability-Gap Benchmark puts it through tasks that are impossible without acquiring a missing capability, across the six families and multiple frontier models, in three conditions: a frozen-harness baseline (no pull), governed pull (the full Governor), and pull without the LLM judge. Runs are graded by an external oracle, never the system's own verifier (which would be circular).
It targets the load-bearing question of self-provisioning that, to our knowledge, no prior benchmark isolates: does the agent know when it's blocked, and does it request the right capability? It measures pull-decision precision/recall, request appropriateness (premature / wasted / unused), governance cost (the deterministic-vs-LLM split), and per-family efficacy.
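As a rough illustration of what pull-decision precision and recall mean here, the sketch below scores a list of (requested, truly blocked) pairs. The function name and input shape are assumptions for this example; the benchmark's own scoring code may differ.

```python
from typing import Iterable, Tuple


def pull_precision_recall(decisions: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Score pull decisions given (agent_requested, was_truly_blocked) pairs."""
    pairs = list(decisions)
    tp = sum(1 for req, blocked in pairs if req and blocked)      # asked when blocked
    fp = sum(1 for req, blocked in pairs if req and not blocked)  # asked needlessly
    fn = sum(1 for req, blocked in pairs if not req and blocked)  # blocked, never asked
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Precision penalises premature or wasted requests; recall penalises runs where the agent was blocked but never asked.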
Two properties hold by construction, independent of any run:
- Bounded safety. Every high-risk request escalates regardless of configuration; a judge fault escalates; a Governor failure can only ever deny or escalate, never grant. These are verified as explicit safety properties, not hoped-for behavior.
- Cost-disciplined governance. Deterministic policy resolves the easy majority for free; the LLM judge is reserved for genuinely ambiguous cases.
The headline result: across 179 trials over 5 models / 5 vendors, a frozen-harness baseline succeeds on 6% of gap-bearing tasks and governed pull on 61%, recovering ~60% of the baseline's failures at modest cost. The full results (the recognition rate and request-outcome taxonomy, governance cost, and the bounded-safety verification) are in the preprint.
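One plausible reading of "recovering ~60% of the baseline's failures" is the share of baseline-failed tasks that governed pull turns into successes. The arithmetic below uses the rounded figures quoted above and is an assumption about the definition, not the preprint's exact computation.

```python
def recovered_fraction(baseline_success: float, governed_success: float) -> float:
    """Share of the baseline's failures that the governed condition recovers."""
    baseline_failures = 1.0 - baseline_success
    return (governed_success - baseline_success) / baseline_failures


# Rounded headline figures: 6% baseline success, 61% governed-pull success.
print(f"{recovered_fraction(0.06, 0.61):.1%}")  # prints 58.5%
```

With the rounded inputs this gives about 58.5%, in line with the "~60%" quoted; the unrounded trial counts would give a slightly different figure.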
Preprint: Reverse-Harness: Design Patterns for Runtime, Agent-Initiated
Capability Provisioning under Governance.
Rameswaran Mohan, 2026. Preprint, not yet peer-reviewed; licensed
CC BY 4.0. Every number
is reproducible from cgb_eval/ and cgb_results/ via
python -m cgb_eval.paper_numbers.
Cite (DOI): 10.5281/zenodo.20816383
@misc{mohan2026reverseharness,
title = {Reverse-Harness: Design Patterns for Runtime, Agent-Initiated
Capability Provisioning under Governance},
author = {Mohan, Rameswaran},
year = {2026},
note = {Preprint},
doi = {10.5281/zenodo.20816383},
url = {https://doi.org/10.5281/zenodo.20816383}
}
Quick start
pip install systemu
In your chosen working directory:
sharing_on init # seeds the starter catalog (41 tools, idempotent)
sharing_on setup # pick your LLM provider + model preset, store keys securely
sharing_on daemon start
sharing_on setup walks you through choosing a provider (OpenRouter, Google,
OpenAI, Anthropic, or a local Ollama) per tier and stores the keys in a local
.env: entered hidden, never echoed, never typed into a browser. Skip it and
daemon start runs the same flow on first launch.
Open http://localhost:8765. A short setup wizard and guided tour take it from there: confirm your models, say who you are, run a starter task, then hit Record and teach it something real.
The one-page guide: OPERATOR-SOP.md covers the record → approve → run → results loop, what each approval card means, and a troubleshooting table. New to the vocabulary? docs/glossary.md maps Systemu terms to industry ones.
Docker (Postgres-backed) and enterprise (Redis-scaled) modes:
git clone <this repo> && cd <repo>
python install.py --mode docker-local # or docker-enterprise
What's in the box
- Sharing-On (sharing_on): the capture engine. It records screenshots, window switches, file changes, and input while you demonstrate a task, then turns the recording into accurate plain-English instructions.
- Systemu runtime: executes workflows through AI Shadow agents (specialists created per job, with your approval), a curated 41-tool registry that works out of the box, MCP connector support, episodic memory, and an evolution engine that proposes improvements from real runs. A Reverse-Harness Governor arbitrates the capabilities a running agent asks for, writes a per-run decision-audit trail, and (opt-in) can fan a decomposed goal out to a bounded fleet of parallel sub-agents.
- Bring your own model: choose a provider per tier: OpenRouter, Google, OpenAI, Anthropic (native SDK), or a local Ollama (keyless, on-device). Presets (budget / balanced / quality) set the cost/quality dial in one keystroke; sharing_on setup or Settings stores the keys.
- The dashboard: a command center with Home · Work · Shadows · Build · Insights · Settings, a persistent Needs you + Live rail, and one Decisions Inbox where every approval lands. Quick tasks answer in seconds from Chat; recorded workflows re-run in one click.
More: Getting Started · Architecture · User Guide · Contributing
How it works
You perform a task on your computer
        │
        ▼
Sharing-On records: screenshots, window switches,
file changes, clipboard, process events
        │
        ▼
Intent extractor (Tier-2 LLM) infers what you
actually wanted: written to intent.json, not
inferred from the click sequence (v0.6.0)
        │
        ▼
Scroll refiner turns the intent + abstracted
steps into a structured Scroll with objectives
        │
        ▼
Pre-flight scroll validator (opt-in) checks
satisfiability + intent-vs-tool fit; (v0.4.0 + v0.6.0)
surfaces a side-by-side remediation card
with a proposed_revision when blocked (v0.6.0)
        │
        ▼
Activity extractor selects tools and skills
via data-flow reasoning (schemas in headers,
not just keyword name match) (v0.6.0)
        │
        ▼
Missing tools forged with intent context →
dry-run validation gate (Gate 3.5) (v0.5.0)
        │
        ▼
Shadow decision picks an existing specialist OR
creates a new one, scoring on semantic intent
match plus skill/tool ID overlap (v0.6.0)
        │
        ▼
Supervisor dispatches the Shadow. Intelligent
Supervisor (opt-in) intervenes between
iterations with bounded actions including
RECALIBRATE_TOOL / RECALIBRATE_SKILL when
capabilities are structurally inadequate
        │
        ▼
Reverse-Harness Governor arbitrates capability
requests the running Shadow PULLs: a missing
tool, a dependency, an escalation, or a fan-out
to parallel sub-agents (opt-in). Under the
default risk-tiered gate mode it auto-grants
low-risk requests and escalates the rest to the
Decisions Inbox; on approval the run resumes.
Every iteration's decision is written to a
per-run decision-audit ledger
        │
        ▼
Dashboard shows live progress, results,
per-shadow + per-tool metrics, memory, and the
Decisions Inbox for every operator gate
A deeper walkthrough of every stage lives in
ARCHITECTURE.md.
Dashboard
The web dashboard (default http://localhost:8765) is organised as a six-spine command center. The left sidebar has exactly six entries:
| Spine | Route | What it holds |
|---|---|---|
| Home | / | Overview: stat cards, the workflow pipeline, and the live activity feed |
| Work | /work | The workflow-centric view; Scrolls + Activities fold in here |
| Shadows | /shadows | The Shadow roster (agent personas) and their per-shadow memory |
| Build | /tools | Tool registry (with an agent-built filter for tools Systemu forged itself); Skills and Evolution proposals fold in here |
| Insights | /insights | Memory, the capability flywheel, and the event stream (tabbed) |
| Settings | /settings | LLM tier config, the gate-mode dial, and approval defaults |
Two surfaces are present on every page:
- Right rail: a persistent panel showing what Needs you (a glance at pending gates) and Live (a feed of in-flight runs). On narrow viewports it collapses to a "Needs you (N)" badge in the header.
- Decisions Inbox (/inbox): the single place every approval gate lands as one unified card: scroll-approval, dependency, tool-forge, evolution, harness-escalation, and recovery gates. Approve executes: approving a card runs the same action the CLI would (e.g. approving a scroll triggers activity extraction).
Gate modes
Settings exposes a gate-mode dial that controls how the runtime handles approval gates:
| Mode | Behaviour |
|---|---|
| Risk-tiered (default) | The Governor auto-grants low-risk requests and escalates the rest to the Inbox |
| Approve-only | Every gate waits for the operator |
| Bypass | Auto-grants every gate except the safety floor (dependency/recovery gates); dev/test only |
A safety floor keeps dependency and recovery gates interactive even
under Bypass unless explicitly disabled. The same dial is available from
the CLI via sharing_on decisions mode.
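The dial and its safety floor can be sketched as one routing function. Everything named below (`resolve_gate`, `FLOOR_GATES`, the mode and risk strings) is hypothetical; applying the floor in every mode, not only Bypass, is an assumption drawn from the phrase "even under Bypass".

```python
FLOOR_GATES = {"dependency", "recovery"}  # gates the safety floor keeps interactive
MODES = {"risk-tiered", "approve-only", "bypass"}


def resolve_gate(mode: str, gate_kind: str, risk: str, floor_enabled: bool = True) -> str:
    """Return 'auto' (grant without asking) or 'inbox' (wait for the operator)."""
    if mode not in MODES:
        raise ValueError(f"unknown gate mode: {mode!r}")
    # Assumption: the floor applies in every mode unless explicitly disabled.
    if floor_enabled and gate_kind in FLOOR_GATES:
        return "inbox"
    if mode == "approve-only":
        return "inbox"
    if mode == "bypass":
        return "auto"
    # risk-tiered: only low-risk requests are auto-granted.
    return "auto" if risk == "low" else "inbox"
```

For example, `resolve_gate("bypass", "dependency", "low")` still returns `"inbox"` while the floor is enabled.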
Legacy URLs still work. /army redirects to /shadows; /systemu-chat, /memory, /flywheel, and /notifications redirect into their merged tabs. The old /workshop route is gone; its scroll rebuild is now an in-place dialog on the Scrolls view.
Prerequisites
Resource minimums (verified during the manual smoke run)
| Resource | local | docker-local | docker-enterprise |
|---|---|---|---|
| CPU cores | 2 | 2 | 4 |
| Free RAM | 4 GB | 6 GB | 8 GB (Redis + Postgres + workers) |
| Free disk | 2 GB | 4 GB | 6 GB |
| Network | LLM API access | + Postgres | + Redis |
Software
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.10+ (3.12 tested) | Required for all modes |
| pip | latest | pip install --upgrade pip |
| Git | 2.30+ | Required for ./install.sh |
| Docker | 24+ / Desktop 4.x | Required for docker-local and docker-enterprise |
| Node.js + npm | 18+ | Optional; only for the Chrome capture extension |
OS support
| OS | Native capture | docker-* modes |
|---|---|---|
| Windows 10 / 11 | ✅ verified | ✅ verified |
| macOS 13+ | ⚠️ partial | ✅ |
| Ubuntu 22.04+ | ⚠️ needs xdotool xclip | ✅ |
Linux capture extras:
sudo apt install xdotool xclip # Debian / Ubuntu
sudo dnf install xdotool xclip # Fedora
LLM access
Systemu calls models in three tiers and you choose a provider per tier: mix and match, or use one everywhere. You need credentials for at least one of:
- OpenRouter (free tier works): the default; one key reaches many models
- Google AI Studio (free)
- OpenAI
- Anthropic: native SDK (install the anthropic extra)
- A local Ollama instance on :11434 (keyless, on-device)
sharing_on setup collects the keys (hidden entry, stored in .env) and the
Settings page lets you switch providers, models, and the budget / balanced /
quality preset anytime.
Install from source (Docker & enterprise modes)
Installed with pip above? You're done. This section is only for running the
Docker / enterprise stacks or hacking on the code. Full walkthrough lives in
docs/getting-started.md. The headline:
git clone https://github.com/rameswaran-mohan/project-systemu.git
cd project-systemu
./install.sh # Linux/macOS (or install.bat on Windows)
./start.sh # Linux/macOS (or start.bat on Windows)
install.sh asks which deployment mode you want and sets everything up. Three options:
| Mode | What you get | Best for |
|---|---|---|
| local | Native venv. Daemon + worker run as detached subprocesses. SQLite vault + Huey-SQLite broker. | Single-machine dev / personal use. |
| docker-local | docker-compose. Postgres vault + Huey-SQLite broker. One worker container. | Hobbyist self-hosting on one box. |
| docker-enterprise | docker-compose. Postgres vault + Redis broker. N worker containers (scale via WORKER_REPLICAS). | Production / multi-host. |
The dashboard runs at http://localhost:8765 in every mode.
./stop.sh (or stop.bat) shuts everything down cleanly.
To re-run the installer after changing your mind: ./install.sh will detect the existing
install and offer reconfigure / upgrade-deps / quit.
To upgrade an existing install to the latest release: ./update.sh
(or update.bat). It stops the daemon, runs git pull --ff-only, reinstalls deps,
runs alembic migrations, and restarts. Pass --yes / /y for non-interactive
CI / cron usage. It refuses to run on a dirty working tree.
Non-interactive install (CI / automation)
./install.sh --mode docker-enterprise --non-interactive \
--pg-password=hunter2 --redis-password=hunter3 \
--worker-replicas=4 \
--openrouter-key=sk-... --google-key=AIza...
Record a workflow (optional)
After ./start.sh:
sharing_on record --name "My workflow"
# Press Ctrl+C when done; Systemu converts the recording into a Scroll
Windows note (v0.7.3): Use Ctrl+C directly in the same terminal where
sharing_on record is running. Sending SIGINT from another process via kill -INT <pid> (e.g. from Git Bash or a background script) may not deliver the signal to the Python child reliably: the session may stop without writing its final end_time, leaving session.json looking half-complete. Events in events.db are still complete and the session is fully usable by sharing_on analyze.
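Why Ctrl+C in the same terminal matters: Python turns a delivered SIGINT into a KeyboardInterrupt in the main thread, which gives a recorder the chance to finalize its metadata. The sketch below shows that general pattern only; `record_session` and its metadata fields are hypothetical and not sharing_on's actual implementation.

```python
import json
import time
from pathlib import Path
from typing import Callable


def record_session(session_file: Path, capture_loop: Callable[[], None]) -> dict:
    """Run a capture loop and always write end_time, even on Ctrl+C."""
    meta = {"start_time": time.time(), "end_time": None}
    try:
        capture_loop()              # blocks until the user stops recording
    except KeyboardInterrupt:
        pass                        # Ctrl+C is the normal way to stop
    finally:
        # Runs on normal exit and on KeyboardInterrupt, but not if the
        # process is killed without the signal reaching Python.
        meta["end_time"] = time.time()
        session_file.write_text(json.dumps(meta), encoding="utf-8")
    return meta
```

If the process is terminated without Python ever seeing the interrupt, the `finally` block never runs, which matches the half-complete session.json described above.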
Export a recorded workflow as a portable Agent Skill
Once a recording has been analyzed, one command turns it into a portable Anthropic Agent Skill bundle that any Agent-Skills-compatible runtime (Claude Code, Cursor, etc.) can load:
sharing_on capture export-skill ./captures/<your_session_dir> \
--output ./my-skill
# -> ./my-skill/<kebab-name>/SKILL.md
Validate the bundle with skills-ref validate ./my-skill/<kebab-name>.
Legacy / advanced Docker profiles
The original profiles are still in docker-compose.yml for backwards compatibility:
docker compose up systemu # legacy file backend
docker compose --profile docker-sandbox up systemu-docker # tool sandbox
Migrating from a pre-pivot install
If you already have a JSON-vault deployment from before the holistic-enterprise pivot and want to move to docker-local or docker-enterprise, run the one-shot migration tool after spinning up the new Postgres:
# 1. Start the new stack so Postgres is up + tables created
./install.sh --mode docker-enterprise --skip-pull --pg-password=<your-pg> --redis-password=<your-redis>
docker compose --profile enterprise up -d postgres
alembic upgrade head # creates tables in the new Postgres
# 2. Dry-run: see what would migrate
python -m systemu.migrations.json_to_db \
--source ./systemu/vault --dry-run
# 3. Run for real
python -m systemu.migrations.json_to_db \
--source ./systemu/vault \
--target "postgresql://systemu:<pg-password>@localhost:5432/systemu"
The migration is idempotent: re-running it after fixing any errors leaves
already-migrated rows untouched. See systemu/migrations/json_to_db.py for
the source list (scrolls, shadows, tools, skills, activities, evolutions,
chat history).
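Idempotence of this kind is commonly achieved by keying rows on a stable ID and skipping rows that already exist. The sketch below shows that technique with the standard-library sqlite3 module; the table layout and `migrate` helper are made up for illustration, and json_to_db.py may work differently (on Postgres the analogous clause is ON CONFLICT DO NOTHING).

```python
import sqlite3
from typing import Iterable, Mapping


def migrate(conn: sqlite3.Connection, rows: Iterable[Mapping[str, str]]) -> int:
    """Insert rows keyed by id; rows that already exist are left untouched."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scrolls (id TEXT PRIMARY KEY, title TEXT)"
    )
    inserted = 0
    for row in rows:
        cur = conn.execute(
            "INSERT OR IGNORE INTO scrolls (id, title) VALUES (?, ?)",
            (row["id"], row["title"]),
        )
        inserted += cur.rowcount   # 1 if inserted, 0 if the id already existed
    conn.commit()
    return inserted


conn = sqlite3.connect(":memory:")
rows = [{"id": "s1", "title": "Invoices"}, {"id": "s2", "title": "Reports"}]
first = migrate(conn, rows)    # inserts both rows
second = migrate(conn, rows)   # re-run: nothing new to insert
```

The second call changes nothing, which is the property that makes a failed migration safe to re-run after fixing the error.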
For Redis topologies beyond the default standalone (TLS, Sentinel, custom CA),
see docs/redis-topologies.md.
Configuration
Every setting lives in your .env file. Copy .env.example
(each variable is documented inline) as a starting point, or let
sharing_on setup and the dashboard Settings page write them for you. The
ones you'll actually touch:
| Variable | Default | What it does |
|---|---|---|
| API key (one of) | (none) | OPENROUTER_API_KEY (default, many models), GOOGLE_API_KEY, OPENAI_API_KEY, ANTHROPIC_API_KEY (needs systemu[anthropic]), or OLLAMA_URL (local, keyless). At least one. |
| SYSTEMU_MODEL_PRESET | budget | Cost/quality dial: budget, balanced, or quality. Override any tier with SYSTEMU_TIER{1,2,3}_MODEL; an explicit tier model always wins. |
| SYSTEMU_STORAGE | sqlite (local) | file, sqlite, or postgres; set by install.py per mode. |
| SYSTEMU_DASHBOARD_PORT | 8765 | Dashboard port. |
| SYSTEMU_OUTPUT_DIR | ~/Documents | Where agent-generated files land. |
| SYSTEMU_NON_INTERACTIVE | false | Auto-pick the safe default in every approval prompt (dev/CI only). |
| SYSTEMU_DELEGATE_USE_PARALLEL | false | Opt in to parallel sub-agent fan-out for granted SUBAGENT requests. |
The full set (per-tier models + providers, queue/Redis, Docker host binds,
the Intelligent-Supervisor budget knobs, pre-flight validators, recalibration
auto-approve, persona dials, and capture intervals) is documented inline in
.env.example and editable from the Settings page.
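The rule "an explicit tier model always wins" over the preset can be sketched as a two-step lookup. The environment variable names come from the table above; the `PRESETS` contents are placeholder strings, since the real preset-to-model mapping is not listed here, and `resolve_model` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Placeholder model names; the actual preset contents are not given in this document.
PRESETS = {
    "budget":   {1: "budget-tier1",   2: "budget-tier2",   3: "budget-tier3"},
    "balanced": {1: "balanced-tier1", 2: "balanced-tier2", 3: "balanced-tier3"},
    "quality":  {1: "quality-tier1",  2: "quality-tier2",  3: "quality-tier3"},
}


def resolve_model(tier: int, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the model for a tier: explicit override first, then the preset."""
    env = os.environ if env is None else env
    explicit = env.get(f"SYSTEMU_TIER{tier}_MODEL")
    if explicit:
        return explicit                       # an explicit tier model always wins
    preset = env.get("SYSTEMU_MODEL_PRESET", "budget")  # documented default
    return PRESETS[preset][tier]
```

So setting SYSTEMU_MODEL_PRESET=quality together with SYSTEMU_TIER2_MODEL changes only tiers 1 and 3 to the quality preset; tier 2 keeps the explicit model.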
Storage Modes
install.py writes SYSTEMU_STORAGE=sqlite to .env for local mode and postgres for docker-local / docker-enterprise. The in-process default when no env is set is file (kept for backward compat with pre-v0.3 installs).
SYSTEMU_STORAGE=sqlite (default for local mode)
- SQLite database at SYSTEMU_DATABASE_URL, e.g. sqlite:///./data/systemu.db
- Durable task queue with crash recovery + orphan requeue
- Dashboard and worker run as separate processes
- Alembic migrations run automatically on first start
- Recommended for single-machine deployments
SYSTEMU_STORAGE=postgres (default for docker-local / docker-enterprise)
- PostgreSQL backend (managed by docker-compose)
- Multi-machine / multi-worker deployments
- Same Alembic migrations as SQLite
SYSTEMU_STORAGE=file (legacy)
- State stored as JSON files in systemu/vault/
- Zero external dependencies
- Kept for backward compatibility; use the migration tool below to move to SQLite or Postgres
Migrating from file → SQLite or Postgres:
SYSTEMU_STORAGE=sqlite SYSTEMU_DATABASE_URL=sqlite:///./data/systemu.db \
python -m systemu.migrations.json_to_db --source ./systemu/vault --dry-run
See the Migrating from a pre-pivot install section above for the Postgres path.
Project Structure
project-systemu/
├── sharing_on/                        # Capture engine + analyser
│   ├── collectors/                    # Screen, clipboard, file, window monitors
│   ├── analyzer/                      # Step detector, narrative generator
│   │   ├── intent_extractor.py        # v0.6.0 Tier-2 pre-pass that infers
│   │   │                              # outcome-oriented intent before the
│   │   │                              # narrative LLM runs (intent.json)
│   │   └── prompts/                   # Analyzer prompt library
│   ├── output/                        # instructions.md renderer
│   └── cli.py                         # `sharing_on` command entry point
│
├── systemu/                           # Systemu runtime
│   ├── core/                          # Pydantic models (Shadow, Scroll,
│   │                                  # Activity, Tool, Skill, Objective…)
│   ├── pipelines/                     # Stage 1-6 transformations
│   │   ├── scroll_refiner.py          # Stage 2: intent + objectives
│   │   ├── scroll_validator.py        # Pre-flight intent-aware check
│   │   ├── scroll_remediator.py       # v0.6.0 side-by-side fix card
│   │   ├── activity_extractor.py      # Stage 3: schema-aware extraction
│   │   ├── skill_validator.py         # v0.6.0 GUI-codification check
│   │   ├── skill_recalibrator.py      # v0.6.0 re-author instructions_md
│   │   ├── tool_forge.py              # Spec → code → save (Gate 1/2)
│   │   ├── tool_dry_run.py            # v0.5.0 Gate 3.5 validation
│   │   ├── tool_recalibrator.py       # v0.5.0 bump-vs-fork pipeline
│   │   ├── tool_inadequacy_diagnosis.py  # v0.5.0 supervisor diagnosis
│   │   ├── shadow_decision.py         # Stage 5: intent-aware tiebreak
│   │   ├── refinery.py                # Post-execution memory consolidation
│   │   ├── evolution_engine.py        # Long-term shadow/skill evolution
│   │   ├── memory_consolidator.py     # Tiered memory consolidation
│   │   ├── cross_shadow_patterns.py   # Promotion of recurring lessons
│   │   └── workshop_module.py         # Operator-driven scroll/shadow edit
│   ├── runtime/                       # Shadow ReAct loop + Supervisor
│   │   ├── shadow_runtime.py          # Per-shadow execute loop
│   │   ├── supervisor.py              # Activity queue + worker pool
│   │   ├── execution_mind.py          # Intelligent Supervisor (v0.4.0)
│   │   ├── execution_snapshot.py      # v0.5.1 true snapshot resume
│   │   ├── failure_classifier.py      # 10-category failure taxonomy
│   │   ├── tool_metrics.py / shadow_metrics.py  # per-id telemetry
│   │   ├── affinity_log.py            # Activity-shadow routing memory
│   │   ├── inadequacy_tracker.py      # Cross-shadow tool-inadequacy clustering
│   │   ├── rejection_store.py         # Operator-feedback learning
│   │   ├── governor.py                # Reverse-Harness Governor (arbitrate + materialise capability PULLs)
│   │   ├── harness_arbiter.py         # Deterministic GRANT/DENY/ESCALATE policy
│   │   ├── subagent_fleet.py / subagent_harness.py  # opt-in parallel child fan-out + partial-success collation
│   │   ├── decision_audit.py          # per-iteration decision ledger (executions/<id>/decision_audit.jsonl)
│   │   ├── gate_mode_settings.py      # Gate-mode dial (bypass / risk-tiered / approve-only) + floor
│   │   ├── tool_sandbox.py            # Subprocess / docker / wsl / ssh exec
│   │   └── tool_registry.py           # Runtime tool loader
│   ├── interface/                     # NiceGUI dashboard + REST API
│   │   ├── pages/                     # Home, Work, Shadows, Build, Insights, Settings, Inbox, Chat
│   │   ├── command/                   # Shared command layer (Inbox queue, gates, verbs)
│   │   └── cli_commands.py            # Systemu CLI groups (scrolls/army/tools/skills/decisions/…)
│   ├── messaging/                     # Optional Telegram gateway
│   ├── prompts/                       # Tier-1/2/3 prompt library
│   ├── queue/                         # In-process / SQLite / Redis priority queues
│   ├── storage/sqlite/                # SQLite + Postgres vault (SQLAlchemy)
│   ├── vault/                         # File-based vault + starter pack
│   │   ├── tools/                     # Starter tool implementations
│   │   ├── shadow_army/               # Starter Shadow configurations
│   │   └── skills/                    # Starter SKILL.md files (Anthropic
│   │                                  # Agent Skills Standard compatible)
│   ├── scheduler/                     # Daemon + recurring jobs
│   └── worker.py                      # Background worker entry point
│
├── alembic/versions/                  # DB schema migrations (0001-0010)
├── extension/                         # Chrome extension for web-event capture
├── docs/                              # Architecture, getting-started, messaging
├── tests/                             # pytest suite
├── docker-compose.yml
├── Dockerfile
├── install.py / install.sh / install.bat
├── start.sh / start.bat / stop.sh / stop.bat
└── .env.example
sharing_on Capture
sharing_on records what you do and produces:
captures/
└── my_task_cap_YYYYMMDD_HHMMSS/
    ├── instructions.md    # Step-by-step workflow guide
    ├── session.json       # Session metadata
    ├── events.db          # Raw captured events
    └── assets/            # Screenshots embedded in instructions.md
The instructions.md is converted into a Systemu Scroll when you submit the capture to the dashboard.
Privacy: keystrokes are NOT recorded; clipboard auto-redacts secrets; no data leaves your machine until the LLM analysis step.
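Clipboard redaction of this kind is typically pattern-based. The sketch below is an assumption about how such a filter could look, not Systemu's actual rules: the patterns, the `[REDACTED]` marker, and the `redact` helper are all illustrative, and the key prefixes are taken only from the example flags earlier in this document.

```python
import re

# Illustrative patterns only; a real redactor would need a broader, tested set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_\-]{8,}"),        # keys shaped like "sk-..."
    re.compile(r"AIza[0-9A-Za-z_\-]{10,}"),      # keys shaped like "AIza..."
    re.compile(r"(?i)(password|secret|token)\s*[=:]\s*\S+"),  # key=value secrets
]


def redact(text: str) -> str:
    """Replace anything that looks like a secret before it is stored."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Running the filter before clipboard content is written to the event store means the raw secret never reaches disk; text that matches no pattern passes through unchanged.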
CLI reference
Everything in the dashboard is also driven from the sharing_on CLI.
Run sharing_on --help (or sharing_on <group> --help) for the full
surface; the headline groups:
| Command | Purpose |
|---|---|
| sharing_on record / analyze | Capture a workflow / re-analyze a recorded session |
| sharing_on init | Seed the working-directory vault from the bundled starter catalog |
| sharing_on setup | Pick the LLM provider + model preset per tier and store keys securely (hidden entry → .env); auto-runs on first daemon start if unconfigured |
| sharing_on daemon start / stop / status | Run the background daemon + web dashboard |
| sharing_on doctor <id> | Diagnose pending gates/blockers for a scroll/activity/shadow/tool (--apply to auto-fix) |
| sharing_on scrolls list / show / refine / approve | Manage Scrolls (refined SOPs) |
| sharing_on army list / show / awaken / execute | Manage and run Shadows |
| sharing_on tools list / forge / dry-run / enable / recalibrate | Manage the tool registry + its forge gates |
| sharing_on skills list / export / deprecate | Manage Skills (export to a portable Agent Skill) |
| sharing_on evolve run / show-pending / apply | Run and apply the Evolution Engine |
| sharing_on decisions list / mode / resolve | The Decisions Inbox from the terminal; mode sets the gate-mode dial |
| sharing_on chat submit / history | Run a free-text task through the full pipeline |
| sharing_on settings show / set | Inspect / write allow-listed configuration |
| sharing_on session · capability · skill · user | Inspect episodic memory, the capability ledger, bundled skills, and your profile |
Development
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Apply database migrations
alembic upgrade head
# Generate a new migration after model changes
alembic revision --autogenerate -m "describe_change"
Contributing
Pull requests are welcome, from humans and AI agents. See
CONTRIBUTING.md for the contribution flow,
including the explicit guidelines for AI-authored PRs.
- Report bugs / suggest features: issue tracker
- Security disclosures: SECURITY.md
- Community expectations: CODE_OF_CONDUCT.md
- Release notes: release-notes/ (one file per version, written at release time)
Project status
Pre-1.0; current release v0.9.43. It's used daily, but APIs and
behavior can still change. Full per-version history lives in
release-notes/.
The arc so far: a capture engine + three deployment modes → the Intelligent Supervisor and tool-readiness pipeline (forge → dry-run → recalibrate) → intent-aware extraction → the Decisions Inbox + gate-mode dial → the Reverse-Harness Governor: capability-pull arbitration across all six families, scoped leases, a per-run decision-audit ledger, pip-first onboarding, per-tier providers (OpenRouter/Google/OpenAI/Anthropic/Ollama), MCP connectors, and opt-in parallel sub-agent fan-out.
What's next
The next-phase work is open for design. Likely candidates (not yet scheduled):
- Auto-recalibration without operator approval for low-risk skill patterns (telemetry-gated promotion)
- The remaining harness provisioners: SKILL / ACCESS / COMPUTE (TOOL ships; SUBAGENT fan-out ships opt-in; ACCESS isolation is currently Docker-only / future work)
- Recursive sub-agent decomposition (today's fleet is one level deep)
- Multi-tenant deployment + per-operator vaults
- Hosted catalog of community-contributed tools / skills
If you want to contribute, CONTRIBUTING.md describes the contribution flow.
Troubleshooting
The fixes for what new users hit most:
- Windows (start.bat prints "the system cannot find the drive specified"): cosmetic stderr from a stale PATH entry (an old mapped network drive). It doesn't affect startup; remove the dead entry from PATH.
- Linux (capture records empty events): pynput needs X11, but Ubuntu/Fedora default to Wayland. Log in with an Xorg session (the daemon, dashboard, and tools work fine on Wayland; only capture is affected) and sudo apt install xdotool xclip.
- macOS (capture is empty): grant Accessibility + Screen Recording to your terminal in System Settings → Privacy & Security, then restart the daemon.
- HTTP 401 from OpenRouter: the key is mistyped, revoked, or lacks model access. Generate a fresh one at https://openrouter.ai/keys.
- Daemon 500s with no such column: the DB schema is behind the code. ./update.sh (or re-running the installer) applies the migrations.
More environment-specific fixes (corporate proxy, Apple Silicon / Rosetta,
Docker host binds, older Python) are in USER_GUIDE.md. For
anything else, use the issue tracker.
License
MIT โ see LICENSE.