AI-native operations runtime for vssh-backed infrastructure

MeshClaw

MeshClaw is an AI-native operations runtime for real infrastructure.

It is not a chatbot, not a generic assistant, and not a replacement for Codex, Claude, SSH, Docker, or Kubernetes. Users talk to Codex, Claude, ChatGPT, Open WebUI/local models, or another operator frontend. MeshClaw provides the runtime layer those operators call for truthful infrastructure state, policy decisions, safe execution, workflow evidence, service diagnostics, cleanup plans, capability registry, and auditability.

The product direction is fixed in docs/SURVIVAL_DIRECTION.md: MeshClaw is the shared operational truth and execution-control layer for Codex, Claude, local models, Matrix operations rooms, and human operators.

User
  -> Codex / Claude / Open WebUI / Matrix ops room
  -> MeshClaw MCP / CLI
  -> inventory + capabilities + policy + workflows + evidence + audit
  -> vssh-native over Tailscale / provider APIs / monitor agents / adapters
  -> servers / models / APIs / temporary capacity

Scope

MeshClaw owns:

  • server inventory
  • model and API capability registry
  • workspace registry: which model/human is working on which server/folder
  • capacity and budget facts
  • fleet status
  • policy answers
  • safe remote execution
  • server operations agent workflows
  • log analysis
  • security checks
  • provision/bootstrap/deprovision hooks
  • diagnostics and repair plans
  • service/log/deploy runbooks
  • audit and evidence
  • CLI, dashboard, and MCP surfaces
  • AI-operator friendly outputs for Codex and Claude

MeshClaw does not own:

  • general chat
  • assistant personality
  • Matrix-first personal assistant behavior
  • Siri or Shortcuts automation
  • mail, calendar, browser, or lifestyle assistant tools
  • multi-agent roleplay or broad coworker orchestration
  • coding-agent replacement workflows

Install

Public install target:

pip install -U meshclaw vssh
meshclaw --install-binary
meshclaw --print-binary
meshclaw mcp

The meshclaw PyPI package is a Python entrypoint that finds the real Go runtime. If no runtime binary is present, it tries to bootstrap one with go install into ~/.local/bin. Operators can override that behavior:

MESHCLAW_BIN=/path/to/meshclaw meshclaw help
MESHCLAW_AUTO_INSTALL=0 meshclaw --no-auto-install help
MESHCLAW_INSTALL_DIR=/opt/meshclaw/bin meshclaw --install-binary
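
The override behavior above can be sketched in a few lines of Python. This is a minimal illustration, not the wrapper's actual source: the function names, the lookup order after MESHCLAW_BIN, and the assumption that the runtime file is called meshclaw inside the install directory are all hypothetical. Only the three environment variable names come from this README.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_runtime(env: Mapping[str, str], fallback_dir: Path) -> Optional[str]:
    """Return a path to the Go runtime, or None if it would need bootstrapping."""
    # An explicit MESHCLAW_BIN override wins over any discovery.
    override = env.get("MESHCLAW_BIN")
    if override:
        return override
    # Assumption: the runtime lives in MESHCLAW_INSTALL_DIR when set,
    # otherwise in the fallback directory (for example ~/.local/bin).
    install_dir = Path(env.get("MESHCLAW_INSTALL_DIR") or fallback_dir)
    candidate = install_dir / "meshclaw"
    if candidate.is_file() and os.access(candidate, os.X_OK):
        return str(candidate)
    return None


def may_auto_install(env: Mapping[str, str]) -> bool:
    """MESHCLAW_AUTO_INSTALL=0 turns off the go install bootstrap."""
    return env.get("MESHCLAW_AUTO_INSTALL", "1") != "0"
```

A caller would pass os.environ and Path.home() / ".local" / "bin", and only attempt the go install bootstrap when resolve_runtime returns None and may_auto_install returns True.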

During local development the Go binaries can be built directly:

go build -o /Users/dragon/bin/meshclaw ./cmd/meshclaw
cd /Users/dragon/meshpop-repos/vssh && go build -o /Users/dragon/bin/vssh ./cmd/vssh

First Test

Run these before connecting an AI client:

meshclaw workflows
meshclaw workflows inspect fleet-health-demo --json
meshclaw run fleet-health-demo --dry-run --json
meshclaw evidence open latest
meshclaw nodes list
meshclaw capabilities
meshclaw monitor-check
meshclaw autoheal-plan --json
meshclaw doctor d1
meshclaw service-audit d1
meshclaw data-clean-plan d1 /home
meshclaw policy-check codex data_clean_apply server
meshclaw evidence-list 10
meshclaw mcp

Expected behavior:

  • fleet-health-demo creates a repeatable evidence bundle without requiring private email, DNS, or provider credentials
  • the bundle includes plan.md, execution.json, steps.jsonl, meshclaw-actions.md, and report.html
  • read-only tools return structured facts and evidence paths
  • autoheal-plan marks each action with policy_decision and approval_required
  • autoheal-apply-safe executes only mode=auto_safe, policy_decision=allow, approval_required=false actions
  • cleanup starts with data-clean-plan; destructive apply requires approval and manifest evidence
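
The autoheal-apply-safe gate described above amounts to a three-field filter. The sketch below shows that filter over a hand-written plan; the field names mode, policy_decision, and approval_required come from this README, while the id key, the sample values manual and deny, and the list-of-dicts shape are assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

Action = Dict[str, Any]


def is_auto_safe(action: Action) -> bool:
    """True only when all three documented conditions hold."""
    return (
        action.get("mode") == "auto_safe"
        and action.get("policy_decision") == "allow"
        and action.get("approval_required") is False
    )


def split_actions(actions: List[Action]) -> Tuple[List[Action], List[Action]]:
    """Separate actions that may run from actions that must be skipped."""
    runnable = [a for a in actions if is_auto_safe(a)]
    skipped = [a for a in actions if not is_auto_safe(a)]
    return runnable, skipped


# Hypothetical plan entries; a real plan comes from `meshclaw autoheal-plan --json`.
plan = [
    {"id": "a1", "mode": "auto_safe", "policy_decision": "allow", "approval_required": False},
    {"id": "a2", "mode": "auto_safe", "policy_decision": "allow", "approval_required": True},
    {"id": "a3", "mode": "manual", "policy_decision": "allow", "approval_required": False},
    {"id": "a4", "mode": "auto_safe", "policy_decision": "deny", "approval_required": False},
]
runnable, skipped = split_actions(plan)
print([a["id"] for a in runnable])  # ['a1']
print([a["id"] for a in skipped])   # ['a2', 'a3', 'a4']
```

An action with a missing field is treated as not safe, which matches the conservative reading of the rule.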

Developer Commands

The full local command surface is broader:

go run ./cmd/meshclaw direction
go run ./cmd/meshclaw list
go run ./cmd/meshclaw capabilities
go run ./cmd/meshclaw capabilities init --force
go run ./cmd/meshclaw status
go run ./cmd/meshclaw monitor-check
go run ./cmd/meshclaw ops-control
go run ./cmd/meshclaw ops-control --apply-safe
go run ./cmd/meshclaw monitor-agent 5m
go run ./cmd/meshclaw monitor-agent 10m --hygiene
go run ./cmd/meshclaw fleet-scan --hosts d1,v1 --security --hygiene --logs --json
go run ./cmd/meshclaw service-triage --limit 5
go run ./cmd/meshclaw autoheal-plan
go run ./cmd/meshclaw autoheal-apply-safe
go run ./cmd/meshclaw disk-investigate d1 /home/dell
go run ./cmd/meshclaw data-clean-plan d1 /home/dell/kobolt
go run ./cmd/meshclaw data-clean-apply d1 /tmp/meshclaw-data-clean-plan-d1-...
go run ./cmd/meshclaw policy-check codex read_state server
go run ./cmd/meshclaw policy-show
go run ./cmd/meshclaw policy-init --preset devops
go run ./cmd/meshclaw policy-presets
go run ./cmd/meshclaw matrix-plan
go run ./cmd/meshclaw workers
go run ./cmd/meshclaw workspace-list
go run ./cmd/meshclaw workspace-add meshclaw-local local /Users/dragon/meshclaw codex serverops
go run ./cmd/meshclaw workspace-activity meshclaw-local codex edit "added workspace registry"
go run ./cmd/meshclaw ops-chat
go run ./cmd/meshclaw ops-dispatch matrix "!workers"
go run ./cmd/meshclaw ops-dispatch openwebui "workspaces"
go run ./cmd/meshclaw evidence-list 10
go run ./cmd/meshclaw ai-guide --json
go run ./cmd/meshclaw tool-recommend "d1 disk cleanup and duplicate checkpoint removal" --json
go run ./cmd/meshclaw adapters --json
go run ./cmd/meshclaw workflows
go run ./cmd/meshclaw workflows inspect fleet-health-demo --json
go run ./cmd/meshclaw run fleet-health-demo --dry-run --json
go run ./cmd/meshclaw workflows inspect meshclaw-ops-orchestration-demo --json
go run ./cmd/meshclaw run meshclaw-ops-orchestration-demo --dry-run --json
go run ./cmd/meshclaw workflows inspect email-orchestration-demo --json
go run ./cmd/meshclaw workflows resume latest --json
go run ./cmd/meshclaw approvals grant latest send-approval --actor dragon --reason "approved test email send"
go run ./cmd/meshclaw approvals list latest
go run ./cmd/meshclaw run email-orchestration-demo --execute --approvals latest --json
go run ./cmd/meshclaw run email-orchestration-demo --dry-run --step send-approval --json
go run ./cmd/meshclaw run meshclaw-runtime-why-demo --dry-run
go run ./cmd/meshclaw run meshclaw-runtime-why-demo --execute
go run ./cmd/meshclaw evidence open latest
go run ./cmd/meshclaw run d1 'hostname && uptime'
go run ./cmd/meshclaw doctor d1
go run ./cmd/meshclaw analyze-logs d1 syslog
go run ./cmd/meshclaw service-check v3 server-agent.service
go run ./cmd/meshclaw service-remove v3 walknews.service /root/walknews
go run ./cmd/meshclaw security-check d1
go run ./cmd/meshclaw hygiene-plan d1
go run ./cmd/meshclaw hygiene-scan-host d1
go run ./cmd/meshclaw provision-plan batch-log-analysis 10
go run ./cmd/meshclaw mcp

The execution path is vssh-native first, over Tailscale or another private network. SSH is only a fallback for nodes that do not have a vssh server running yet. Wire remains available for legacy compatibility only.

Default remote execution requires:

Tailscale/private route + vssh server + VSSH_SECRET

Fallback execution still needs Tailscale + sshd + SSH key/user mapping.
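
The two requirement lists above imply a simple preference order. The following sketch makes that order explicit; NodeFacts, its field names, and the returned labels are hypothetical and do not describe MeshClaw's internal types. Wire is left out because this README does not state its requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeFacts:
    """Hypothetical per-node facts an operator might already know."""
    private_route: bool    # Tailscale or other private route is up
    vssh_server: bool      # vssh server is running on the node
    vssh_secret: bool      # VSSH_SECRET is available to the caller
    sshd: bool             # sshd is running on the node
    ssh_key_mapping: bool  # SSH key/user mapping is configured


def choose_transport(node: NodeFacts) -> str:
    """Prefer vssh, fall back to SSH, otherwise report the node unreachable."""
    if not node.private_route:
        return "unreachable"
    if node.vssh_server and node.vssh_secret:
        return "vssh"
    if node.sshd and node.ssh_key_mapping:
        return "ssh-fallback"
    return "unreachable"
```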

Product Claim

Kubernetes is for orchestrating containerized workloads. MeshClaw is for operating the servers that already exist: VPS nodes, home servers, GPU boxes, NAS devices, Docker hosts, mail servers, and small private infrastructure. When existing capacity is not enough, MeshClaw exposes approved provisioning hooks so an AI operator can plan, rent, bootstrap, attach, use, and tear down temporary servers under policy.

Agent Workflows

MeshClaw should expose repeatable infrastructure workflows as MCP tools and CLI commands. These workflows return structured findings, risk levels, evidence, and recommended next actions. Codex, Claude, or a local model explains and coordinates the plan; MeshClaw supplies the operational truth.

Initial workflows:

  • fleet-health-demo: generic server operations workflow loaded from workflows/fleet-health-demo.json; this is the preferred OSS-facing example because it is not tied to email
  • meshclaw-ops-orchestration-demo: combined AI-operator workflow loaded from workflows/meshclaw-ops-orchestration-demo.json; it explains why MeshClaw and vssh are needed when Codex/Claude can do one-off manual orchestration, then ties together fleet state, mail/DNS approval gates, Ollama worker lanes, service triage, autoheal planning, cleanup planning, screenshot evidence, and a final runtime report
  • meshclaw-runtime-why-demo: prove why MeshClaw exists when Codex/Claude can already do the work; render the positioning artifact and write a runtime evidence bundle
  • ollama-orchestration-demo: replay model-worker orchestration with structured failures and evidence
  • email-orchestration-demo: replay mail/DNS/Mox operations with approval gates for real sends or provider changes
  • doctor: diagnose reachability, services, capacity, and runtime health
  • monitor-check: check the whole fleet and store evidence
  • ops-control: summarize fleet health, service risks, auto-safe candidates, next commands, and evidence in one server-management control report
  • monitor-agent: continuously collect fleet state and alert evidence; with --hygiene, it also stores redacted sensitive-data leak findings
  • fleet-scan: run monitor, security, logs, and redacted hygiene checks across selected hosts and store one evidence bundle for AI review
  • autoheal-plan: convert fleet alerts and service triage into structured actions with policy_decision and approval_required
  • autoheal-apply-safe: execute only plan actions where mode=auto_safe, policy_decision=allow, and approval_required=false; all other plan actions are skipped with evidence
  • disk-investigate: collect disk evidence without deleting data
  • data-clean-plan: find raw/intermediate/checkpoint cleanup candidates, preserve clean/final outputs, and write both a manifest and structured JSONL sidecar with category, risk, size, and reason
  • data-clean-apply: apply a manifest generated by data-clean-plan; policy requires approval for real deletion
  • analyze-logs: summarize recent logs, detect errors, and cite evidence
  • service-check: collect read-only systemd status, unit config, and logs
  • service-triage: run service audit, inspect top candidates, and classify them as real incidents, stale boot-only findings, ignore candidates, or approval-required actions
  • service-quarantine: disable a flapping service only when its ExecStart target is missing
  • service-remove: stop/disable a local systemd service, remove its local unit, and optionally remove its matching working directory
  • security-check: check SSH exposure, users, updates, firewall, open ports, failed logins, risky services, and secret handling
  • hygiene-plan: continuously detect sensitive data leaks, log leaks, risky permissions, and safe remediation opportunities
  • hygiene-scan-host: scan likely remote logs/config files for redacted secret and PII leak evidence without storing raw values
  • capacity-plan: decide whether existing servers are enough
  • provision-plan: propose temporary VPS/GPU capacity under budget policy

Runtime workflows can be built into the binary or loaded from JSON files in ./workflows, ~/.meshclaw/workflows, or directories listed in MESHCLAW_WORKFLOW_DIR. This keeps domain workflows such as email, DNS, VPS provisioning, or browser automation outside the runtime core.
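
As a rough illustration of that search path, the sketch below collects workflow JSON files from the three locations. Two details are assumptions, not documented behavior: that MESHCLAW_WORKFLOW_DIR separates multiple directories the way PATH does, and that the first directory defining a workflow name takes precedence.

```python
import os
from pathlib import Path
from typing import Dict, List, Mapping


def workflow_dirs(env: Mapping[str, str], cwd: Path, home: Path) -> List[Path]:
    """Build the ordered list of directories to search."""
    dirs = [cwd / "workflows", home / ".meshclaw" / "workflows"]
    extra = env.get("MESHCLAW_WORKFLOW_DIR", "")
    # Assumption: entries are separated by os.pathsep, like PATH.
    dirs += [Path(p) for p in extra.split(os.pathsep) if p]
    return dirs


def discover_workflows(dirs: List[Path]) -> Dict[str, Path]:
    """Map workflow name (file stem) to the JSON file that defines it."""
    found: Dict[str, Path] = {}
    for directory in dirs:
        if not directory.is_dir():
            continue
        for path in sorted(directory.glob("*.json")):
            # Assumption: earlier directories win on a name collision.
            found.setdefault(path.stem, path)
    return found
```

Calling discover_workflows(workflow_dirs(os.environ, Path.cwd(), Path.home())) would list file-based workflows such as fleet-health-demo; built-in workflows are not visible to this sketch.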

meshclaw adapters [--json] lists runtime adapters and whether they actually execute. local and vssh are executable today. manual, policy, mail, dns, browser, and cloud are evidence-only placeholders until concrete adapters are configured.

Workflow execution has bounded step timeouts so one slow remote command cannot hold the whole run indefinitely. Defaults are local=90s and remote=15s; override with MESHCLAW_WORKFLOW_LOCAL_TIMEOUT_SECONDS and MESHCLAW_WORKFLOW_REMOTE_TIMEOUT_SECONDS when running heavy diagnostics.
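
The defaults and overrides can be expressed as a small lookup. The sketch uses the two documented variable names and default values; how the runtime treats an empty, non-numeric, or zero value is not documented, so falling back to the default here is an assumption.

```python
from typing import Mapping

DEFAULTS = {"local": 90, "remote": 15}
ENV_VARS = {
    "local": "MESHCLAW_WORKFLOW_LOCAL_TIMEOUT_SECONDS",
    "remote": "MESHCLAW_WORKFLOW_REMOTE_TIMEOUT_SECONDS",
}


def step_timeout(kind: str, env: Mapping[str, str]) -> int:
    """Return the step timeout in seconds for 'local' or 'remote' steps."""
    raw = env.get(ENV_VARS[kind], "").strip()
    # Assumption: anything other than a positive integer falls back to the default.
    if raw.isdigit() and int(raw) > 0:
        return int(raw)
    return DEFAULTS[kind]
```

For a heavy diagnostic run, setting MESHCLAW_WORKFLOW_REMOTE_TIMEOUT_SECONDS=120 in the environment would make step_timeout("remote", os.environ) return 120 under these assumptions.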

Every workflow bundle includes an AI handoff section in meshclaw-actions.md and report.html. The handoff tells Codex, Claude, or a local model to treat execution.json as source of truth, steps.jsonl as the timeline, and approval-required skips as intentional gates rather than failures.

meshclaw workflows inspect <name> is the preflight view for Codex, Claude, and local LLMs. It returns the workflow steps, approval gates, required adapters, required nodes, policy decisions, and matching capability IDs before anything is executed.

meshclaw workflows resume [latest|bundle|execution.json] reads a previous workflow evidence bundle and writes resume-plan.json. It does not execute anything. It classifies failed, retryable, approval-pending, and dry-run ready-for-execute steps so an AI operator can continue from evidence instead of reconstructing state from chat history. Resume items include action, resource, approval actor, approval time, and approval source when available.

meshclaw approvals grant ... appends an approval record to approvals.jsonl inside the workflow evidence bundle. meshclaw workflows resume reads that file and changes matching steps from approval_pending to approved_ready. Approval records identify actor, workflow, step, action, resource, reason, source, timestamp, and bundle path.
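
To make the record-then-resume flow concrete, here is a sketch of one JSONL approval line and the status change it triggers. The JSON key names, the id and status keys on steps, and matching on step ID alone are assumptions; this README only lists which facts a record identifies and the two status names.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def approval_line(actor: str, workflow: str, step: str, action: str,
                  resource: str, reason: str, source: str, bundle: str) -> str:
    """Serialize one approval record as a single JSONL line (hypothetical keys)."""
    record = {
        "actor": actor, "workflow": workflow, "step": step, "action": action,
        "resource": resource, "reason": reason, "source": source,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "bundle": bundle,
    }
    return json.dumps(record, sort_keys=True)


def apply_approvals(steps: List[Dict[str, Any]],
                    approval_lines: List[str]) -> List[Dict[str, Any]]:
    """Return copies of steps with approved pending steps marked approved_ready."""
    approved = {json.loads(line)["step"] for line in approval_lines if line.strip()}
    result = []
    for step in steps:
        if step["status"] == "approval_pending" and step["id"] in approved:
            step = {**step, "status": "approved_ready"}
        result.append(step)
    return result
```

Because records are only appended, the file stays an audit log: a later reader can see who approved which step and why, independent of whether the step was ever executed.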

meshclaw run <workflow> --execute --approvals latest loads approval records from a previous evidence bundle and records matching approval metadata on each execution result. Approval-gated local and vssh steps can execute after approval. Approval-gated non-executable adapters such as policy, manual, mail, or provider steps are kept as structured skipped results until a specific adapter exists, so approvals are auditable but not over-interpreted. Use --step <id[,id]> to rerun only selected workflow steps from a resume plan instead of replaying the whole workflow.

Policy is loaded from ~/.meshclaw/policy.json, or from MESHCLAW_POLICY_FILE when set. Configured rules are evaluated before the built-in safety defaults, so operators can grant or restrict Codex, Claude, local LLMs, and automations without changing code.
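
The "configured rules first, built-in defaults second" ordering is essentially first-match evaluation over two concatenated lists. The sketch below illustrates that idea only; the rule schema (actor, action, resource, decision, with * as a wildcard), the require_approval and deny values, and the final default of deny are all hypothetical and not the real policy.json format.

```python
from pathlib import Path
from typing import Dict, List, Mapping

Rule = Dict[str, str]


def policy_path(env: Mapping[str, str], home: Path) -> Path:
    """MESHCLAW_POLICY_FILE overrides the default ~/.meshclaw/policy.json."""
    override = env.get("MESHCLAW_POLICY_FILE")
    return Path(override) if override else home / ".meshclaw" / "policy.json"


def matches(rule: Rule, actor: str, action: str, resource: str) -> bool:
    """A missing or '*' field matches anything (assumed wildcard convention)."""
    wanted = (("actor", actor), ("action", action), ("resource", resource))
    return all(rule.get(key, "*") in ("*", value) for key, value in wanted)


def decide(configured: List[Rule], builtin: List[Rule],
           actor: str, action: str, resource: str) -> str:
    """First matching rule wins; configured rules are checked before built-ins."""
    for rule in list(configured) + list(builtin):
        if matches(rule, actor, action, resource):
            return rule["decision"]
    return "deny"  # Assumption: nothing matched, so refuse.
```

The argument order mirrors the CLI form meshclaw policy-check codex data_clean_apply server shown earlier: actor, then action, then resource.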

The capability registry is loaded from ~/.meshclaw/capabilities.json, or from MESHCLAW_CAPABILITY_FILE when set. meshclaw capabilities init --force writes a starter registry, then MeshClaw merges it with inventory-discovered node capabilities such as GPU workers, NAS/storage nodes, mail servers, and automation lanes. Secrets remain use-only; capability listings describe what is available without revealing credentials.

Runtime workflow evidence bundles also snapshot the capability registry:

evidence/latest/
  plan.md
  execution.json
  steps.jsonl
  capabilities.json
  meshclaw-actions.md
  report.html

This makes a Codex/Claude operation reproducible: the model can see not only what steps ran, but also what servers, model lanes, storage nodes, APIs, and approval-gated capabilities were available at that moment.
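
A consumer of a bundle may want to confirm that it is complete before trusting it. The sketch below checks for the six files listed above and parses a JSONL timeline; the helper names are hypothetical, and the content of each steps.jsonl line is left opaque because its schema is not described here.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

EXPECTED_FILES = [
    "plan.md",
    "execution.json",
    "steps.jsonl",
    "capabilities.json",
    "meshclaw-actions.md",
    "report.html",
]


def missing_files(bundle: Path) -> List[str]:
    """Return the expected bundle files that are absent from the directory."""
    return [name for name in EXPECTED_FILES if not (bundle / name).is_file()]


def parse_jsonl(text: str) -> List[Dict[str, Any]]:
    """Parse JSONL text into one dict per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

For example, missing_files(Path("evidence/latest")) returning an empty list would indicate that all six files are present.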

Natural-language conversation is owned by Codex, Claude, ChatGPT, Open WebUI, or another model frontend. MeshClaw makes that conversation operational by exposing MCP tools, policy decisions, vssh execution, and evidence.

Hygiene workflows are allowed to auto-apply only safe repairs such as permission hardening, redacted log copies, and quarantine. Destructive actions, secret rotation, database edits, service restarts, and provider revocation need approval.
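
That rule reads as an allowlist: a short set of repairs may run unattended, and everything else waits for a human. A minimal sketch follows, with hypothetical identifiers for the repair kinds, since this README names the categories but not their machine-readable IDs.

```python
# Hypothetical identifiers for the three repair kinds named as safe above.
AUTO_SAFE_REPAIRS = {"permission_hardening", "redacted_log_copy", "quarantine"}


def hygiene_gate(repair_kind: str) -> str:
    """Allow only allowlisted repairs to auto-apply.

    Destructive actions, secret rotation, database edits, service restarts,
    and provider revocation are not on the list, so they need approval.
    Assumption: unknown kinds are treated the same way.
    """
    if repair_kind in AUTO_SAFE_REPAIRS:
        return "auto_apply"
    return "approval_required"
```

Using an allowlist instead of a blocklist means a newly added repair kind needs approval until someone deliberately marks it safe.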

Non-Conversation Rule

All natural-language planning belongs to Codex, Claude, ChatGPT, local LLMs, or another operator frontend. MeshClaw interfaces return structured facts and action results. Matrix is allowed as an operations room, notification channel, approval channel, and optional MCP command surface; it is not the assistant brain.

Real Matrix bridge commands:

meshclaw matrix-config-init --force
meshclaw matrix-post "MeshClaw Matrix bridge connected"
meshclaw matrix-sync-once
meshclaw matrix-bridge

Archived Previous Version

The previous broad personal-AI-runtime version was archived outside this repo:

/Users/dragon/meshclaw-archive-20260516-serverops-pivot

Current Handoff

See:

docs/HANDOFF_2026-05-16.md
docs/MCP_SETUP.md

MCP

Run:

meshclaw mcp

AI operator rule of thumb:

  • Use MeshClaw MCP for policy, state, capability registry, workflow runs, approval boundaries, and evidence.
  • Use direct vssh only for low-level structured remote execution, typed facts, daemon RPC, parallel execution primitives, or debugging MeshClaw adapters.
  • Prefer meshclaw_run_evidence over raw vssh/SSH when Codex/Claude needs an audit trail.
  • Prefer meshclaw_workflow_run over reconstructing multi-step operations from chat history.
  • Use meshclaw_tool_recommend when an AI operator is unsure whether an intent belongs in MeshClaw, direct vssh, or a safer plan/apply workflow.

Canonical tools:

  • meshclaw_ai_guide
  • meshclaw_tool_recommend
  • meshclaw_workflow_list
  • meshclaw_workflow_run
  • meshclaw_evidence_latest
  • meshclaw_ops_control
  • meshclaw_run_evidence

Tool surface:

  • meshclaw_server_list
  • meshclaw_ai_guide
  • meshclaw_tool_recommend
  • meshclaw_workers
  • meshclaw_workspace_list
  • meshclaw_workspace_add
  • meshclaw_workspace_activity
  • meshclaw_capability_list
  • meshclaw_monitor_check
  • meshclaw_autoheal_plan
  • meshclaw_autoheal_apply_safe
  • meshclaw_workflow_list
  • meshclaw_workflow_run
  • meshclaw_evidence_latest
  • meshclaw_evidence_list
  • meshclaw_policy_check
  • meshclaw_policy_show
  • meshclaw_matrix_plan
  • meshclaw_ops_dispatch
  • meshclaw_provision_plan
  • meshclaw_run_evidence
  • meshclaw_disk_investigate
  • meshclaw_data_clean_plan
  • meshclaw_data_clean_apply
  • meshclaw_service_check
  • meshclaw_service_audit
  • meshclaw_service_triage
  • meshclaw_service_quarantine
  • meshclaw_service_remove
  • meshclaw_fleet_scan
  • meshclaw_fleet_service_audit
  • meshclaw_security_check
  • meshclaw_hygiene_scan_host
  • meshclaw_node_repair_plan
  • meshclaw_vssh_daemon_audit
  • meshclaw_vssh_auth_paths
  • meshclaw_process_top
  • meshclaw_orchestration_plan
  • meshclaw_placement_plan
  • meshclaw_workflow_inspect
  • meshclaw_workflow_resume
  • meshclaw_approvals_list
  • meshclaw_approvals_grant
  • meshclaw_job_start
  • meshclaw_job_status
  • meshclaw_job_logs
  • meshclaw_job_cancel
  • meshclaw_artifact_collect

Tool names use underscores. Dotted names such as meshclaw.autoheal_plan are legacy documentation bugs and are not valid MCP tool names.
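
A client-side guard against the dotted form might look like the sketch below. The regular expression is inferred from the tool names listed above (lowercase words joined by underscores, starting with meshclaw_) and is an assumption, not a published naming specification.

```python
import re

# Assumed pattern: "meshclaw_" followed by lowercase words joined by underscores.
TOOL_NAME = re.compile(r"meshclaw_[a-z0-9]+(?:_[a-z0-9]+)*")


def is_valid_tool_name(name: str) -> bool:
    """True for names shaped like meshclaw_autoheal_plan."""
    return TOOL_NAME.fullmatch(name) is not None


def fix_legacy_name(name: str) -> str:
    """Rewrite a legacy dotted name to the underscore form."""
    return name.replace(".", "_")


print(is_valid_tool_name("meshclaw.autoheal_plan"))  # False
print(fix_legacy_name("meshclaw.autoheal_plan"))     # meshclaw_autoheal_plan
```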
