Record computer activity and generate step-by-step instructions

Systemu

A personal AI workforce you instruct in plain language or by showing it, and that grows the capabilities it needs to finish the job, under your governance.

Ask a quick question and get an answer in seconds. Hand a whole task to a chat and an AI specialist runs it end-to-end. Or record a task on screen once and replay it forever. However you instruct it, when the agent hits something it lacks mid-run (a tool that doesn't exist, a skill it wasn't given, a file it can't read), it doesn't fail and it doesn't fake it. It requests the missing capability, and an always-on Governor grants, denies, or escalates the request by risk. Every action is gated, logged, and local. Self-provisioning, made safe.

License: MIT Python 3.10+ DOI

Three ways to put it to work:

  • 💬 Ask: a quick question in Chat comes back in seconds (plan-first, and honest when an answer is only partial).
  • 🗣️ Delegate: describe a whole task in plain language; an AI specialist runs it end-to-end through the governed pipeline.
  • 🎬 Demonstrate: record a task on screen once; Systemu turns it into a reusable workflow you replay in one click.

Tell it or show it, verbal or visual. Either way, the agent assembles the capabilities it needs as it goes, under your approval.

Why Systemu is different

Most automation is frozen at design time. RPA scripts break when a selector moves; agent frameworks can only use the tools you wired up in advance. But the capability an agent actually needs is usually discovered mid-task: a tool that doesn't exist yet, a skill it wasn't given, a file it can't read. The system guesses the toolkit up front, and the agent, the one actually doing the work, can't ask for more.

Systemu inverts that. Instead of the system pushing a fixed harness to the agent, the running agent pulls the capabilities it lacks at runtime, and an always-on Governor arbitrates every request by risk, auto-granting the safe and escalating the rest to you. The agent assembles its own harness, just-in-time, under governance. We named the pattern Reverse-Harness and built a benchmark for it.

It starts where RPA starts (record a task once) and goes where RPA can't: the agent grows to finish the job. Built for work that has consequences:

  • Governed self-provisioning: forging a tool, attaching an MCP server, reading a secret, spawning sub-agents. Each is a request the Governor grants, denies, or escalates. High-risk requests always land as one card in your Inbox with a plain-English summary and a safe default.
  • Local-first: your recordings, workflows, memory, and results live in a vault on your machine. API keys are never typed into the browser.
  • Honest by construction: tool results are verified (a no-output call is a failure, not a phantom success), outcomes report file paths you can open, and "couldn't do it" is never dressed up as done.

The Reverse-Harness pattern

The Reverse-Harness loop: a running agent hits a capability gap, issues a REQUEST_HARNESS pull, the Governor arbitrates by risk (low auto-grants, medium goes to an off-path judge, high escalates to the operator Inbox), the grant is materialised across six families, leased and logged, and the run resumes.

Classic agent harnesses are push: the system decides, at design time, which tools and permissions an agent gets. Reverse-Harness flips it to pull: the running agent proposes the capabilities it needs and a governance layer arbitrates them live. REQUEST_HARNESS becomes a first-class loop verb, the inverse of TOOL_CALL: "provision a capability I lack" vs "use one I have."

   PUSH  (classic harness)            PULL  (Reverse-Harness)
   design time · fixed                runtime · just-in-time
   ──────────────────────            ──────────────────────────
   system picks the tools             agent hits a gap mid-task
            │                                    │
            ▼                                    ▼
   agent is frozen with              REQUEST_HARNESS:
   whatever it was handed            "provision what I lack"
            │                                    │
            ▼                                    ▼
   gap at runtime → it fails         Governor arbitrates by risk
                                     (grant · deny · escalate)
                                                 │
                                                 ▼
                                     capability leased + logged,
                                     revocable → the run continues

It generalizes capability acquisition from one class (tools, at design time) to six families the Governor arbitrates at runtime:

Family | The agent requests… | Default gate
Tool | a new executable tool (forge), or reuse of an existing one | reuse auto-grants; new code escalates
Skill | a procedure (SKILL.md), new or reused | reuse low-risk; new text → review
Access | reading a file / resource / secret | whitelisted read low; write / secret / network escalates
Compute | more iterations / think-budget | within ceiling low; over ceiling escalates
Sub-agent | a bounded fleet of parallel child agents | depth + budget clamped; beyond → escalate
MCP | attaching a Model Context Protocol server | re-attach low; new server escalates (SSRF-guarded, tool-hash-pinned)

The arbitration rests on two ideas we think are non-obvious:

  1. A self-requested capability is more dangerous than a pre-provisioned one, because the agent chose it, so it is gated more strictly, not less.
  2. Judgment can only ever downgrade toward safe. When an ambiguous request needs an LLM judge, the judge may deny or escalate; it can never grant beyond policy or "open a hole." A judge fault fails to escalation, not to grant.
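
The "downgrade-only" rule can be sketched in a few lines of Python. This is an illustration, not Systemu's actual API: the names `Verdict`, `apply_judge`, and `faulty_judge` are hypothetical, and the strictness ordering (grant, then escalate, then deny) is an assumption made for the sketch.

```python
from enum import IntEnum
from typing import Callable


class Verdict(IntEnum):
    # Ordered from least to most restrictive (an assumed ordering).
    GRANT = 0
    ESCALATE = 1
    DENY = 2


def apply_judge(policy: Verdict, judge: Callable[[], Verdict]) -> Verdict:
    """Combine a deterministic policy verdict with a judge's opinion."""
    try:
        opinion = judge()
    except Exception:
        # A judge fault fails to escalation, never to grant.
        opinion = Verdict.ESCALATE
    # Keeping the stricter of the two means the judge can only downgrade.
    return max(policy, opinion)


def faulty_judge() -> Verdict:
    """Stand-in for a judge call that crashes or times out."""
    raise RuntimeError("judge unavailable")
```

Because the result is always the stricter of the two verdicts, a judge that answers "grant" on a request the policy wanted escalated changes nothing, and a crashing judge can never produce a grant.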

Every grant is leased and logged: minted on grant, written to a per-run decision-audit ledger with its outcome, and revocable in one click. Each self-built capability carries the provenance of the run that made it. We distilled the whole thing into six reusable patterns (Pull-Provisioning, the REQUEST_HARNESS verb, Risk-Tiered Arbitration, Attributed & Revocable Self-Grants, Off-Path Judgment, and a Provenance Ledger) and benchmarked it (see Evidence).
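
A minimal sketch of what "leased and logged" could look like, assuming an append-only JSON Lines file per run. The `Lease` fields and the function names are hypothetical and chosen for illustration; the real ledger schema may differ.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Lease:
    lease_id: str
    run_id: str          # provenance: the run that asked for the capability
    family: str          # e.g. "tool", "skill", "access"
    capability: str
    granted_at: float
    revoked: bool = False


def append_entry(ledger: Path, event: str, lease: Lease) -> None:
    """Append one JSON line per event; earlier lines are never rewritten."""
    ledger.parent.mkdir(parents=True, exist_ok=True)
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"event": event, **asdict(lease)}) + "\n")


def grant(ledger: Path, run_id: str, family: str, capability: str) -> Lease:
    """Mint a lease and record the grant."""
    lease = Lease(uuid.uuid4().hex, run_id, family, capability, time.time())
    append_entry(ledger, "grant", lease)
    return lease


def revoke(ledger: Path, lease: Lease) -> None:
    """Mark the lease revoked and record the revocation as a new line."""
    lease.revoked = True
    append_entry(ledger, "revoke", lease)
```

An append-only file keeps the history intact: a revocation is a second line that shares the lease id with the original grant, not an edit to it.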

What it looks like

You hand a folder of scanned invoices to a chat: "pull the totals into a spreadsheet." The specialist starts and finds it has no PDF-table extractor. Instead of failing, it requests one:

  1. REQUEST_HARNESS{ kind: tool - "extract a table from a PDF" }
  2. New code is HIGH risk, so the Governor escalates: one card lands in your Inbox: "forge pdf_table_extract? [view code] · [approve] · [deny]."
  3. You approve. Systemu writes the tool, dry-runs it to prove it works, deploys it with an agent-built badge, and the run resumes, now holding the capability it lacked thirty seconds ago.
  4. Next time, the tool already exists: no request, no gate, instant.

Notice the gap → request → govern → grow. That's the whole loop.

How Systemu grows itself

Self-provisioning isn't a one-off. Capabilities the agent acquires persist (attributed and revocable), and the system keeps making them better:

  • Forge tools on demand: a missing tool is written (spec → code), dry-run-validated behind a gate, then deployed as a first-class, reusable tool. You see exactly what the agent built itself, with an agent-built badge, and can revoke it anytime.
  • Recalibrate what's inadequate: when a tool or skill keeps failing, the runtime diagnoses it and either repairs it in place or forks a specialized version, re-validated before it ships.
  • Evolve over time: an evolution engine reviews real runs and proposes improvements (merge duplicate specialists, upgrade a persona with a discovered skill). Proposals land in your Inbox; nothing auto-applies.
  • Remember: episodic memory captures what each run learned, a curator consolidates skills over time (archive, never delete), and a capability ledger tracks what actually works.

Every one of these is governed: forging, recalibration, and evolution surface as approval cards by default. The agent gets more capable; you stay in control of what it keeps.

Evidence

Reverse-Harness is being validated, not just asserted. Our Capability-Gap Benchmark puts it through tasks that are impossible without acquiring a missing capability, across the six families and multiple frontier models, in three conditions: a frozen-harness baseline (no pull), governed pull (the full Governor), and pull without the LLM judge. Runs are graded by an external oracle, never the system's own verifier (which would be circular).

It targets the load-bearing question of self-provisioning that, to our knowledge, no prior benchmark isolates: does the agent know when it's blocked, and does it request the right capability? It measures pull-decision precision/recall, request appropriateness (premature / wasted / unused), governance cost (the deterministic-vs-LLM split), and per-family efficacy.
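
For readers unfamiliar with the metric, here is one way pull-decision precision and recall could be computed, assuming each trial is reduced to a pair (was a gap really present, did the agent pull). The function name and the trial encoding are hypothetical; the benchmark's own scoring code lives in cgb_eval/ and may define these differently.

```python
from typing import Iterable, Tuple


def pull_precision_recall(trials: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Each trial is (gap_present, agent_pulled)."""
    tp = fp = fn = 0
    for gap_present, pulled in trials:
        if pulled and gap_present:
            tp += 1          # pulled when it was genuinely blocked
        elif pulled:
            fp += 1          # pulled with no real gap (a wasted request)
        elif gap_present:
            fn += 1          # blocked but never asked
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Precision answers "when the agent asked, was it really blocked?" and recall answers "when it was blocked, did it ask?".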

Two properties hold by construction, independent of any run:

  • Bounded safety. Every high-risk request escalates regardless of configuration; a judge fault escalates; a Governor failure can only ever deny or escalate, never grant. These are verified as explicit safety properties, not hoped-for behavior.
  • Cost-disciplined governance. Deterministic policy resolves the easy majority for free; the LLM judge is reserved for genuinely ambiguous cases.

The headline result: across 179 trials over 5 models / 5 vendors, a frozen-harness baseline succeeds on 6% of gap-bearing tasks and governed pull on 61%, recovering ~60% of the baseline's failures at modest cost. The full results (the recognition rate and request-outcome taxonomy, governance cost, and the bounded-safety verification) are in the preprint.
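
The "~60%" figure follows from the two success rates if one assumes both conditions ran on the same task set and that governed pull keeps every baseline success. That assumption is ours for this back-of-envelope check; the preprint has the exact accounting.

```python
def recovered_fraction(baseline_success: float, pull_success: float) -> float:
    """Share of baseline failures that succeed under governed pull."""
    baseline_failures = 1.0 - baseline_success
    return (pull_success - baseline_success) / baseline_failures


# (0.61 - 0.06) / (1 - 0.06) = 0.55 / 0.94
print(f"{recovered_fraction(0.06, 0.61):.1%}")  # prints 58.5%
```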

📄 Preprint: Reverse-Harness: Design Patterns for Runtime, Agent-Initiated Capability Provisioning under Governance, by Rameswaran Mohan, 2026. Preprint, not yet peer-reviewed; licensed CC BY 4.0. Every number is reproducible from cgb_eval/ and cgb_results/ via python -m cgb_eval.paper_numbers.

Cite (DOI): 10.5281/zenodo.20816383

@misc{mohan2026reverseharness,
  title  = {Reverse-Harness: Design Patterns for Runtime, Agent-Initiated
            Capability Provisioning under Governance},
  author = {Mohan, Rameswaran},
  year   = {2026},
  note   = {Preprint},
  doi    = {10.5281/zenodo.20816383},
  url    = {https://doi.org/10.5281/zenodo.20816383}
}

Quick start

pip install systemu

In your chosen working directory:

sharing_on init           # seeds the starter catalog (41 tools, idempotent)
sharing_on setup          # pick your LLM provider + model preset, store keys securely
sharing_on daemon start

sharing_on setup walks you through choosing a provider (OpenRouter, Google, OpenAI, Anthropic, or a local Ollama) per tier and stores the keys in a local .env file: entered hidden, never echoed, never typed into a browser. Skip it and daemon start runs the same flow on first launch.

Open http://localhost:8765. A short setup wizard and guided tour take it from there: confirm your models, say who you are, run a starter task, then hit Record and teach it something real.

The one-page guide: OPERATOR-SOP.md covers the record → approve → run → results loop, what each approval card means, and a troubleshooting table. New to the vocabulary? docs/glossary.md maps Systemu terms to industry ones.

Docker (Postgres-backed) and enterprise (Redis-scaled) modes:

git clone <this repo> && cd <repo>
python install.py --mode docker-local     # or docker-enterprise

What's in the box

  • Sharing-On (sharing_on): the capture engine. It records screenshots, window switches, file changes, and input while you demonstrate a task, then turns the recording into accurate plain-English instructions.
  • Systemu runtime: executes workflows through AI Shadow agents (specialists created per job, with your approval), a curated 41-tool registry that works out of the box, MCP connector support, episodic memory, and an evolution engine that proposes improvements from real runs. A Reverse-Harness Governor arbitrates the capabilities a running agent asks for, writes a per-run decision-audit trail, and (opt-in) can fan a decomposed goal out to a bounded fleet of parallel sub-agents.
  • Bring your own model: pick a provider per tier from OpenRouter, Google, OpenAI, Anthropic (native SDK), or a local Ollama (keyless, on-device). Presets (budget / balanced / quality) set the cost/quality dial in one keystroke; sharing_on setup or Settings stores the keys.
  • The dashboard: a command center with Home · Work · Shadows · Build · Insights · Settings, a persistent Needs you + Live rail, and one Decisions Inbox where every approval lands. Quick tasks answer in seconds from Chat; recorded workflows re-run in one click.

📚 More: Getting Started · Architecture · User Guide · Contributing


How it works

You perform a task on your computer
          │
          ▼
Sharing-On records: screenshots, window switches,
  file changes, clipboard, process events
          │
          ▼
Intent extractor (Tier-2 LLM) infers what you
  actually wanted - written to intent.json, not
  inferred from the click sequence              (v0.6.0)
          │
          ▼
Scroll refiner turns the intent + abstracted
  steps into a structured Scroll with objectives
          │
          ▼
Pre-flight scroll validator (opt-in) checks
  satisfiability + intent-vs-tool fit;          (v0.4.0 + v0.6.0)
  surfaces a side-by-side remediation card
  with a proposed_revision when blocked         (v0.6.0)
          │
          ▼
Activity extractor selects tools and skills
  via data-flow reasoning (schemas in headers,
  not just keyword name match)                  (v0.6.0)
          │
          ▼
Missing tools forged with intent context →
  dry-run validation gate (Gate 3.5)            (v0.5.0)
          │
          ▼
Shadow decision picks an existing specialist OR
  creates a new one, scoring on semantic intent
  match plus skill/tool ID overlap              (v0.6.0)
          │
          ▼
Supervisor dispatches the Shadow.  Intelligent
  Supervisor (opt-in) intervenes between
  iterations with bounded actions including
  RECALIBRATE_TOOL / RECALIBRATE_SKILL when
  capabilities are structurally inadequate
          │
          ▼
Reverse-Harness Governor arbitrates capability
  requests the running Shadow PULLs - a missing
  tool, a dependency, an escalation, or a fan-out
  to parallel sub-agents (opt-in).  Under the
  default risk-tiered gate mode it auto-grants
  low-risk requests and escalates the rest to the
  Decisions Inbox; on approval the run resumes.
  Every iteration's decision is written to a
  per-run decision-audit ledger
          │
          ▼
Dashboard shows live progress, results,
  per-shadow + per-tool metrics, memory, and the
  Decisions Inbox for every operator gate

A deeper walkthrough of every stage lives in ARCHITECTURE.md.


Dashboard

The web dashboard (default http://localhost:8765) is organised as a six-spine command center. The left sidebar has exactly six entries:

Spine | Route | What it holds
Home | / | Overview: stat cards, the workflow pipeline, and the live activity feed
Work | /work | The workflow-centric view; Scrolls + Activities fold in here
Shadows | /shadows | The Shadow roster (agent personas) and their per-shadow memory
Build | /tools | Tool registry (with an agent-built filter for tools Systemu forged itself); Skills and Evolution proposals fold in here
Insights | /insights | Memory, the capability flywheel, and the event stream (tabbed)
Settings | /settings | LLM tier config, the gate-mode dial, and approval defaults

Two surfaces are present on every page:

  • Right rail: a persistent panel showing what Needs you (a glance at pending gates) and Live (a feed of in-flight runs). On narrow viewports it collapses to a "Needs you (N)" badge in the header.
  • Decisions Inbox (/inbox): the single place every approval gate lands as one unified card: scroll-approval, dependency, tool-forge, evolution, harness-escalation, and recovery gates. Approve executes: approving a card runs the same action the CLI would (e.g. approving a scroll triggers activity extraction).

Gate modes

Settings exposes a gate-mode dial that controls how the runtime handles approval gates:

Mode | Behaviour
Risk-tiered (default) | The Governor auto-grants low-risk requests and escalates the rest to the Inbox
Approve-only | Every gate waits for the operator
Bypass | Auto-grants every gate except the safety floor (dependency/recovery gates); dev/test only

A safety floor keeps dependency and recovery gates interactive even under Bypass unless explicitly disabled. The same dial is available from the CLI via sharing_on decisions mode.
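
The three modes and the safety floor can be read as a small decision function. The sketch below is an interpretation of the table, not the code in gate_mode_settings.py: `GateMode`, `FLOOR_GATES`, and `needs_operator` are hypothetical names, and applying the floor only under Bypass is an assumption drawn from the sentence above.

```python
from enum import Enum


class GateMode(Enum):
    RISK_TIERED = "risk-tiered"
    APPROVE_ONLY = "approve-only"
    BYPASS = "bypass"


# Gate kinds the safety floor keeps interactive under Bypass.
FLOOR_GATES = {"dependency", "recovery"}


def needs_operator(mode: GateMode, gate_kind: str, low_risk: bool,
                   floor_enabled: bool = True) -> bool:
    """Return True when the gate should wait in the Decisions Inbox."""
    if mode is GateMode.APPROVE_ONLY:
        return True                      # every gate waits for the operator
    if mode is GateMode.BYPASS:
        # Everything auto-grants except floor gates, unless the floor is off.
        return floor_enabled and gate_kind in FLOOR_GATES
    return not low_risk                  # risk-tiered: only low risk auto-grants
```

For example, `needs_operator(GateMode.BYPASS, "recovery", low_risk=True)` still returns True, which is the point of the floor.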

Legacy URLs still work. /army redirects to /shadows; /systemu-chat, /memory, /flywheel, and /notifications redirect into their merged tabs. The old /workshop route is gone; its scroll rebuild is now an in-place dialog on the Scrolls view.


Prerequisites

Resource minimums (verified during the manual smoke run)

Resource | local | docker-local | docker-enterprise
CPU cores | 2 | 2 | 4
Free RAM | 4 GB | 6 GB | 8 GB (Redis + Postgres + workers)
Free disk | 2 GB | 4 GB | 6 GB
Network | LLM API access | + Postgres | + Redis

Software

Requirement | Version | Notes
Python | 3.10+ (3.12 tested) | Required for all modes
pip | latest | pip install --upgrade pip
Git | 2.30+ | Required for ./install.sh
Docker | 24+ / Desktop 4.x | Required for docker-local and docker-enterprise
Node.js + npm | 18+ | Optional; only for the Chrome capture extension

OS support

OS | Native capture | docker-* modes
Windows 10 / 11 | ✅ verified | ✅ verified
macOS 13+ | ⚠️ partial | ✅
Ubuntu 22.04+ | ⚠️ needs xdotool xclip | ✅

Linux capture extras:

sudo apt install xdotool xclip      # Debian / Ubuntu
sudo dnf install xdotool xclip      # Fedora

LLM access

Systemu calls models in three tiers and you choose a provider per tier: mix and match, or use one everywhere. You need credentials for at least one of OpenRouter, Google, OpenAI, or Anthropic, or a local Ollama instance (keyless).

sharing_on setup collects the keys (hidden entry, stored in .env) and the Settings page lets you switch providers, models, and the budget / balanced / quality preset anytime.


Install from source (Docker & enterprise modes)

Installed with pip above? You're done. This section is only for running the Docker / enterprise stacks or hacking on the code. The full walkthrough lives in docs/getting-started.md. The headline:

git clone https://github.com/rameswaran-mohan/project-systemu.git
cd project-systemu
./install.sh        # Linux/macOS    (or  install.bat  on Windows)
./start.sh          # Linux/macOS    (or  start.bat    on Windows)

install.sh asks which deployment mode you want and sets everything up. Three options:

Mode | What you get | Best for
local | Native venv. Daemon + worker run as detached subprocesses. SQLite vault + Huey-SQLite broker. | Single-machine dev / personal use.
docker-local | docker-compose. Postgres vault + Huey-SQLite broker. One worker container. | Hobbyist self-hosting on one box.
docker-enterprise | docker-compose. Postgres vault + Redis broker. N worker containers (scale via WORKER_REPLICAS). | Production / multi-host.

The dashboard runs at http://localhost:8765 in every mode. ./stop.sh (or stop.bat) shuts everything down cleanly.

To re-run the installer after changing your mind: ./install.sh will detect the existing install and offer reconfigure / upgrade-deps / quit.

To upgrade an existing install to the latest release: ./update.sh (or update.bat). It stops the daemon, runs git pull --ff-only, reinstalls deps, runs alembic migrations, and restarts. Pass --yes / /y for non-interactive CI / cron usage. It refuses to run on a dirty working tree.

Non-interactive install (CI / automation)

./install.sh --mode docker-enterprise --non-interactive \
    --pg-password=hunter2 --redis-password=hunter3 \
    --worker-replicas=4 \
    --openrouter-key=sk-... --google-key=AIza...

Record a workflow (optional)

After ./start.sh:

sharing_on record --name "My workflow"
# Press Ctrl+C when done; Systemu converts the recording into a Scroll

Windows note (v0.7.3): Use Ctrl+C directly in the same terminal where sharing_on record is running. Sending SIGINT from another process via kill -INT <pid> (e.g. from Git Bash or a background script) may not deliver the signal to the Python child reliably: the session may stop without writing its final end_time, leaving session.json looking half-complete. Events in events.db are still complete and the session is fully usable by sharing_on analyze.
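
To tell whether a capture ended this way, a small check like the following may help. It is a sketch under one stated assumption: that end_time is a top-level key in session.json (the schema is not documented here). The function name `capture_status` and its return labels are hypothetical.

```python
import json
from pathlib import Path


def capture_status(capture_dir: Path) -> str:
    """Classify a capture directory by which artifacts it contains."""
    events_db = capture_dir / "events.db"
    session_file = capture_dir / "session.json"
    if not events_db.exists():
        return "missing-events"
    if not session_file.exists():
        return "missing-session"
    meta = json.loads(session_file.read_text(encoding="utf-8"))
    # Assumption: a cleanly stopped session records a top-level end_time.
    return "complete" if meta.get("end_time") else "no-end-time"
```

A result of "no-end-time" matches the half-complete case described above; the session can still be passed to sharing_on analyze.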

Export a recorded workflow as a portable Agent Skill

Once a recording has been analyzed, one command turns it into a portable Anthropic Agent Skill bundle that any Agent-Skills-compatible runtime (Claude Code, Cursor, etc.) can load:

sharing_on capture export-skill ./captures/<your_session_dir> \
           --output ./my-skill
# -> ./my-skill/<kebab-name>/SKILL.md

Validate the bundle with skills-ref validate ./my-skill/<kebab-name>.

Legacy / advanced Docker profiles

The original profiles are still in docker-compose.yml for backwards compatibility:

docker compose up systemu                          # legacy file backend
docker compose --profile docker-sandbox up systemu-docker   # tool sandbox

Migrating from a pre-pivot install

If you already have a JSON-vault deployment from before the holistic-enterprise pivot and want to move to docker-local or docker-enterprise, run the one-shot migration tool after spinning up the new Postgres:

# 1. Start the new stack so Postgres is up + tables created
./install.sh --mode docker-enterprise --skip-pull --pg-password=<your-pg> --redis-password=<your-redis>
docker compose --profile enterprise up -d postgres
alembic upgrade head     # creates tables in the new Postgres

# 2. Dry-run: see what would migrate
python -m systemu.migrations.json_to_db \
    --source ./systemu/vault --dry-run

# 3. Run for real
python -m systemu.migrations.json_to_db \
    --source ./systemu/vault \
    --target "postgresql://systemu:<pg-password>@localhost:5432/systemu"

The migration is idempotent: re-running it after fixing any errors leaves already-migrated rows untouched. See systemu/migrations/json_to_db.py for the source list (scrolls, shadows, tools, skills, activities, evolutions, chat history).
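
To make "idempotent" concrete, here is a self-contained sketch of the general technique using Python's built-in sqlite3 module and INSERT OR IGNORE. It is not the logic in json_to_db.py: the two-column scrolls table and the `migrate_rows` function are hypothetical, and SQLite stands in for Postgres so the example runs anywhere.

```python
import sqlite3
from typing import Iterable, Tuple


def migrate_rows(conn: sqlite3.Connection, rows: Iterable[Tuple[str, str]]) -> int:
    """Insert rows keyed by id; rows that already exist are left untouched.

    Returns the number of rows newly inserted.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scrolls (id TEXT PRIMARY KEY, body TEXT)"
    )
    before = conn.execute("SELECT COUNT(*) FROM scrolls").fetchone()[0]
    # OR IGNORE skips any row whose primary key is already present.
    conn.executemany(
        "INSERT OR IGNORE INTO scrolls (id, body) VALUES (?, ?)", rows
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM scrolls").fetchone()[0]
    return after - before
```

Running it twice with the same input inserts nothing the second time, and a row edited in the target between runs keeps its edited value.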

For Redis topologies beyond the default standalone (TLS, Sentinel, custom CA), see docs/redis-topologies.md.


Configuration

Every setting lives in your .env file. Copy .env.example (each variable is documented inline) as a starting point, or let sharing_on setup and the dashboard Settings page write them for you. The ones you'll actually touch:

Variable | Default | What it does
API key (one of) | (none) | OPENROUTER_API_KEY (default, many models), GOOGLE_API_KEY, OPENAI_API_KEY, ANTHROPIC_API_KEY (needs systemu[anthropic]), or OLLAMA_URL (local, keyless). At least one.
SYSTEMU_MODEL_PRESET | budget | Cost/quality dial: budget / balanced / quality. Override any tier with SYSTEMU_TIER{1,2,3}_MODEL; an explicit tier model always wins.
SYSTEMU_STORAGE | sqlite (local) | file / sqlite / postgres; set by install.py per mode.
SYSTEMU_DASHBOARD_PORT | 8765 | Dashboard port.
SYSTEMU_OUTPUT_DIR | ~/Documents | Where agent-generated files land.
SYSTEMU_NON_INTERACTIVE | false | Auto-pick the safe default in every approval prompt (dev/CI only).
SYSTEMU_DELEGATE_USE_PARALLEL | false | Opt in to parallel sub-agent fan-out for granted SUBAGENT requests.

The full set (per-tier models + providers, queue/Redis, Docker host binds, the Intelligent-Supervisor budget knobs, pre-flight validators, recalibration auto-approve, persona dials, and capture intervals) is documented inline in .env.example and editable from the Settings page.
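
The rule "an explicit tier model always wins" can be sketched as a lookup with a fallback. Only the environment variable names come from the table above; the `PRESETS` contents are placeholder model names invented for this example, and `resolve_model` is a hypothetical helper, not a Systemu function.

```python
from typing import Mapping

# Placeholder model names: NOT the real contents of Systemu's presets.
PRESETS = {
    "budget":   {1: "example/small", 2: "example/small", 3: "example/medium"},
    "balanced": {1: "example/small", 2: "example/medium", 3: "example/large"},
    "quality":  {1: "example/medium", 2: "example/large", 3: "example/large"},
}


def resolve_model(tier: int, env: Mapping[str, str]) -> str:
    """Pick the model for a tier: explicit override first, then the preset."""
    explicit = env.get(f"SYSTEMU_TIER{tier}_MODEL")
    if explicit:
        return explicit                  # an explicit tier model always wins
    preset = env.get("SYSTEMU_MODEL_PRESET", "budget")
    return PRESETS[preset][tier]
```

With `{"SYSTEMU_MODEL_PRESET": "quality", "SYSTEMU_TIER2_MODEL": "my/model"}`, tier 2 resolves to `my/model` while tiers 1 and 3 still follow the quality preset.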


Storage Modes

install.py writes SYSTEMU_STORAGE=sqlite to .env for local mode and postgres for docker-local / docker-enterprise. The in-process default when no env is set is file (kept for backward compat with pre-v0.3 installs).

SYSTEMU_STORAGE=sqlite (default for local mode)

  • SQLite database at SYSTEMU_DATABASE_URL, e.g. sqlite:///./data/systemu.db
  • Durable task queue with crash recovery + orphan requeue
  • Dashboard and worker run as separate processes
  • Alembic migrations run automatically on first start
  • Recommended for single-machine deployments

SYSTEMU_STORAGE=postgres (default for docker-local / docker-enterprise)

  • PostgreSQL backend (managed by docker-compose)
  • Multi-machine / multi-worker deployments
  • Same Alembic migrations as SQLite

SYSTEMU_STORAGE=file (legacy)

  • State stored as JSON files in systemu/vault/
  • Zero external dependencies
  • Kept for backward compatibility; use the migration tool below to move to SQLite or Postgres

Migrating from file → SQLite or Postgres:

SYSTEMU_STORAGE=sqlite SYSTEMU_DATABASE_URL=sqlite:///./data/systemu.db \
  python -m systemu.migrations.json_to_db --source ./systemu/vault --dry-run

See the Migrating from a pre-pivot install section above for the Postgres path.


Project Structure

project-systemu/
├── sharing_on/                         - Capture engine + analyser
│   ├── collectors/                       - Screen, clipboard, file, window monitors
│   ├── analyzer/                         - Step detector, narrative generator
│   │   ├── intent_extractor.py             - v0.6.0 Tier-2 pre-pass that infers
│   │   │                                     outcome-oriented intent before the
│   │   │                                     narrative LLM runs (intent.json)
│   │   └── prompts/                        - Analyzer prompt library
│   ├── output/                           - instructions.md renderer
│   └── cli.py                            - `sharing_on` command entry point
│
├── systemu/                            - Systemu runtime
│   ├── core/                             - Pydantic models (Shadow, Scroll,
│   │                                       Activity, Tool, Skill, Objective…)
│   ├── pipelines/                        - Stage 1→6 transformations
│   │   ├── scroll_refiner.py               - Stage 2 - intent + objectives
│   │   ├── scroll_validator.py             - Pre-flight intent-aware check
│   │   ├── scroll_remediator.py            - v0.6.0 side-by-side fix card
│   │   ├── activity_extractor.py           - Stage 3 - schema-aware extraction
│   │   ├── skill_validator.py              - v0.6.0 GUI-codification check
│   │   ├── skill_recalibrator.py           - v0.6.0 re-author instructions_md
│   │   ├── tool_forge.py                   - Spec → code → save (Gate 1/2)
│   │   ├── tool_dry_run.py                 - v0.5.0 Gate 3.5 validation
│   │   ├── tool_recalibrator.py            - v0.5.0 bump-vs-fork pipeline
│   │   ├── tool_inadequacy_diagnosis.py    - v0.5.0 supervisor diagnosis
│   │   ├── shadow_decision.py              - Stage 5 - intent-aware tiebreak
│   │   ├── refinery.py                     - Post-execution memory consolidation
│   │   ├── evolution_engine.py             - Long-term shadow/skill evolution
│   │   ├── memory_consolidator.py          - Tiered memory consolidation
│   │   ├── cross_shadow_patterns.py        - Promotion of recurring lessons
│   │   └── workshop_module.py              - Operator-driven scroll/shadow edit
│   ├── runtime/                          - Shadow ReAct loop + Supervisor
│   │   ├── shadow_runtime.py               - Per-shadow execute loop
│   │   ├── supervisor.py                   - Activity queue + worker pool
│   │   ├── execution_mind.py               - Intelligent Supervisor (v0.4.0)
│   │   ├── execution_snapshot.py           - v0.5.1 true snapshot resume
│   │   ├── failure_classifier.py           - 10-category failure taxonomy
│   │   ├── tool_metrics.py / shadow_metrics.py - per-id telemetry
│   │   ├── affinity_log.py                 - Activity-shadow routing memory
│   │   ├── inadequacy_tracker.py           - Cross-shadow tool-inadequacy clustering
│   │   ├── rejection_store.py              - Operator-feedback learning
│   │   ├── governor.py                      - Reverse-Harness Governor (arbitrate + materialise capability PULLs)
│   │   ├── harness_arbiter.py               - Deterministic GRANT/DENY/ESCALATE policy
│   │   ├── subagent_fleet.py / subagent_harness.py - opt-in parallel child fan-out + partial-success collation
│   │   ├── decision_audit.py                - per-iteration decision ledger (executions/<id>/decision_audit.jsonl)
│   │   ├── gate_mode_settings.py            - Gate-mode dial (bypass / risk-tiered / approve-only) + floor
│   │   ├── tool_sandbox.py                 - Subprocess / docker / wsl / ssh exec
│   │   └── tool_registry.py                - Runtime tool loader
│   ├── interface/                        - NiceGUI dashboard + REST API
│   │   ├── pages/                          - Home, Work, Shadows, Build, Insights, Settings, Inbox, Chat
│   │   ├── command/                         - Shared command layer (Inbox queue, gates, verbs)
│   │   └── cli_commands.py                  - Systemu CLI groups (scrolls/army/tools/skills/decisions/…)
│   ├── messaging/                        - Optional Telegram gateway
│   ├── prompts/                          - Tier-1/2/3 prompt library
│   ├── queue/                            - In-process / SQLite / Redis priority queues
│   ├── storage/sqlite/                   - SQLite + Postgres vault (SQLAlchemy)
│   ├── vault/                            - File-based vault + starter pack
│   │   ├── tools/                          - Starter tool implementations
│   │   ├── shadow_army/                    - Starter Shadow configurations
│   │   └── skills/                         - Starter SKILL.md files (Anthropic
│   │                                         Agent Skills Standard compatible)
│   ├── scheduler/                        - Daemon + recurring jobs
│   └── worker.py                         - Background worker entry point
│
├── alembic/versions/                   - DB schema migrations (0001–0010)
├── extension/                          - Chrome extension for web-event capture
├── docs/                               - Architecture, getting-started, messaging
├── tests/                              - pytest suite
├── docker-compose.yml
├── Dockerfile
├── install.py / install.sh / install.bat
├── start.sh / start.bat / stop.sh / stop.bat
└── .env.example

sharing_on Capture

sharing_on records what you do and produces:

captures/
└── my_task_cap_YYYYMMDD_HHMMSS/
    ├── instructions.md       ← Step-by-step workflow guide
    ├── session.json          ← Session metadata
    ├── events.db             ← Raw captured events
    └── assets/               ← Screenshots embedded in instructions.md

The instructions.md is converted into a Systemu Scroll when you submit the capture to the dashboard.

Privacy: keystrokes are NOT recorded; clipboard auto-redacts secrets; no data leaves your machine until the LLM analysis step.


CLI reference

Everything in the dashboard can also be driven from the sharing_on CLI. Run sharing_on --help (or sharing_on <group> --help) for the full surface. The headline groups:

| Command | Purpose |
| --- | --- |
| sharing_on record / analyze | Capture a workflow / re-analyze a recorded session |
| sharing_on init | Seed the working-directory vault from the bundled starter catalog |
| sharing_on setup | Pick the LLM provider + model preset per tier and store keys securely (hidden entry → .env); auto-runs on first daemon start if unconfigured |
| sharing_on daemon start / stop / status | Run the background daemon + web dashboard |
| sharing_on doctor <id> | Diagnose pending gates/blockers for a scroll/activity/shadow/tool (--apply to auto-fix) |
| sharing_on scrolls list / show / refine / approve | Manage Scrolls (refined SOPs) |
| sharing_on army list / show / awaken / execute | Manage and run Shadows |
| sharing_on tools list / forge / dry-run / enable / recalibrate | Manage the tool registry + its forge gates |
| sharing_on skills list / export / deprecate | Manage Skills (export to a portable Agent Skill) |
| sharing_on evolve run / show-pending / apply | Run and apply the Evolution Engine |
| sharing_on decisions list / mode / resolve | The Decisions Inbox from the terminal; mode sets the gate-mode dial |
| sharing_on chat submit / history | Run a free-text task through the full pipeline |
| sharing_on settings show / set | Inspect / write allow-listed configuration |
| sharing_on session · capability · skill · user | Inspect episodic memory, the capability ledger, bundled skills, and your profile |
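
If you script around the CLI, the two help forms mentioned above are the safest entry points, since they do not change any state. The sketch below is illustrative only: `help_argv` and `show_help` are hypothetical helper names, and the only CLI behavior assumed is the `--help` flag and the group names listed in the table.

```python
import shutil
import subprocess


def help_argv(group=None):
    """Build the argv for `sharing_on --help` or `sharing_on <group> --help`."""
    argv = ["sharing_on"]
    if group:
        argv.append(group)
    argv.append("--help")
    return argv


def show_help(group=None):
    """Return the help text, or None when the sharing_on CLI is not on PATH."""
    if shutil.which("sharing_on") is None:
        return None
    result = subprocess.run(
        help_argv(group), capture_output=True, text=True, check=False
    )
    return result.stdout
```

For example, `show_help("tools")` runs `sharing_on tools --help` and returns whatever the CLI prints to standard output.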

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Apply database migrations
alembic upgrade head

# Generate a new migration after model changes
alembic revision --autogenerate -m "describe_change"

Contributing

Pull requests are welcome, from humans and AI agents alike. See CONTRIBUTING.md for the contribution flow, including the explicit guidelines for AI-authored PRs.


Project status

Pre-1.0; the current release is v0.9.48. It's used daily, but APIs and behavior can still change. Full per-version history lives in release-notes/.

The arc so far: a capture engine + three deployment modes → the Intelligent Supervisor and tool-readiness pipeline (forge → dry-run → recalibrate) → intent-aware extraction → the Decisions Inbox + gate-mode dial → the Reverse-Harness Governor: capability-pull arbitration across all six families, scoped leases, a per-run decision-audit ledger, pip-first onboarding, per-tier providers (OpenRouter/Google/OpenAI/Anthropic/Ollama), MCP connectors, and opt-in parallel sub-agent fan-out.

What's next

The next-phase work is open for design. Likely candidates (not yet scheduled):

  • Auto-recalibration without operator approval for low-risk skill patterns (telemetry-gated promotion)
  • The remaining harness provisioners: SKILL / ACCESS / COMPUTE (TOOL ships; SUBAGENT fan-out ships opt-in; ACCESS isolation is currently Docker-only / future work)
  • Recursive sub-agent decomposition (today's fleet is one level deep)
  • Multi-tenant deployment + per-operator vaults
  • Hosted catalog of community-contributed tools / skills

If you want to contribute, CONTRIBUTING.md describes the contribution flow.


Troubleshooting

Fixes for the problems new users hit most often:

  • Windows: start.bat prints "the system cannot find the drive specified". This is cosmetic stderr output from a stale PATH entry (an old mapped network drive). It doesn't affect startup; remove the dead entry from PATH.
  • Linux: capture records empty events. pynput needs X11, but Ubuntu and Fedora default to Wayland. Log in with an Xorg session (the daemon, dashboard, and tools work fine on Wayland; only capture is affected) and run sudo apt install xdotool xclip.
  • macOS: capture is empty. Grant Accessibility + Screen Recording to your terminal in System Settings → Privacy & Security, then restart the daemon.
  • HTTP 401 from OpenRouter: the key is mistyped, revoked, or lacks model access. Generate a fresh one at https://openrouter.ai/keys.
  • Daemon 500s with no such column: the DB schema is behind the code. Run ./update.sh (or re-run the installer) to apply the migrations.

More environment-specific fixes (corporate proxy, Apple Silicon / Rosetta, Docker host binds, older Python) are in USER_GUIDE.md. For anything else, use the issue tracker.
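
For the Linux case above, a script can check the session type before recording. This is a rough sketch under stated assumptions: the function name `capture_display_warning` is hypothetical, and it relies on the `XDG_SESSION_TYPE` environment variable, which desktop logins normally set to `x11` or `wayland` but which may be unset (for example over SSH), in which case no warning is produced.

```python
import os
import sys


def capture_display_warning(environ=None, platform=None):
    """Return a warning if this looks like Linux on Wayland, else None.

    Hypothetical helper for illustration; Systemu itself may detect this differently.
    """
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    if not platform.startswith("linux"):
        return None
    # Empty string when the variable is unset, so no warning is raised.
    session = environ.get("XDG_SESSION_TYPE", "").lower()
    if session == "wayland":
        return "Wayland session detected: log in with an Xorg session before recording."
    return None
```

Calling `capture_display_warning()` with no arguments inspects the current process environment; passing an explicit mapping and platform string makes the check easy to exercise on any machine.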


License

MIT; see LICENSE.


