WerkGraph: a local-first coordination graph for humans and agentic tools.

WerkGraph

A cold-start orientation for AI agents. It explains what WerkGraph is, why it exists, and how to participate. Implementation details live in the code, the migrations, and the per-capability plan docs. This file is the durable mental model.

Looking for install steps? See docs/install/quickstart.html for the 5-minute path, docs/install/agent-keys.html for per-MCP-client config snippets, and docs/install/troubleshooting.html for common issues.

Working on Tool Registry? Start with docs/adr/ADR-WG-039-tool-registry-v0-routing-auth-dispatch.md, docs/tool-registry-operator-guide.md, and docs/tool-registry-migration-guide.md.

Tool Registry launch note

Tool Registry launch flows are Discovery-first. Actors use the WerkGraph Discovery MCP to search for call_tool, describe it, and invoke registered Tools through WerkGraph policy, audit, route/auth boundaries, and evidence capture. Do not add registered Tool MCP servers directly to actor clients.

The Tool Registry web UI is a read-only inspection and accountability surface. Registered Tool methods are not inherently read-only: mutating methods must be explicitly allowed by Tool policy and remain routed through WerkGraph. WerkGraph stores only opaque route refs, non-secret route metadata, auth state, policy, and audit/evidence refs; upstream API keys, OAuth tokens, bearer tokens, passwords, broker credentials, and local .env secrets stay outside WerkGraph.

What WerkGraph is

WerkGraph is a graph of the work and the artifacts that work produces. Every Project, Workflow, Job, Decision, Learning, Component, Objective, Visual, claim, submission, contract, run, and edge between them is a node or edge in the graph. The only thing the graph does not hold is the code itself — code lives in git. Everything else — what was planned, what was decided, what was learned, what was done, by whom, with what evidence — lives here.

It is read through four equivalent surfaces (REST API, CLI, MCP server, web UI) and written through an audited operation layer that guarantees every state change is paired with its receipt.

It is not a chat memory store, not an LLM tracing tool, not a code knowledge graph, and not a project-management UI bolted onto AI. It is the work-side substrate that other tools in those categories do not provide.

Why WerkGraph exists

Every team running AI agents on real work hits the same gap: when an agent acts, no system of record captures what it did, why it did it, what it depended on, or how that propagates downstream. Issue trackers track human tickets. LLM observability tools trace single runs. Coding assistants ship pull requests. None of them join those signals into a queryable graph that survives the session, the model swap, or the team turnover.

WerkGraph closes that gap. Every claim, submit, decision, learning, and link is a row in an append-only audit log, written in the same database transaction as the state change it records. Every Job carries an inline contract that says what done looks like. Every Decision has provenance and can be superseded without losing history. The graph is the source of truth; the audit log is the receipt.
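The same-transaction pairing described above can be illustrated with a minimal sketch. This is not WerkGraph's actual schema or service layer: the jobs and audit_log tables, the set_job_state function, and the fail_audit flag are all hypothetical, and SQLite stands in for whatever database the project really uses. The sketch only shows the idea that the state change and its audit row commit or roll back together.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    # In-memory database with a hypothetical two-table schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, state TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE audit_log ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "actor TEXT NOT NULL, op TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO jobs (id, state) VALUES (1, 'open')")
    conn.commit()
    return conn


def set_job_state(conn, job_id, new_state, actor, fail_audit=False):
    # "with conn" commits on success and rolls back on any exception,
    # so the UPDATE and the audit INSERT succeed or fail as one unit.
    with conn:
        conn.execute("UPDATE jobs SET state = ? WHERE id = ?", (new_state, job_id))
        if fail_audit:
            # Simulates the audit write failing after the state change.
            raise RuntimeError("simulated audit failure")
        payload = json.dumps({"job_id": job_id, "state": new_state})
        conn.execute(
            "INSERT INTO audit_log (actor, op, payload) VALUES (?, ?, ?)",
            (actor, "set_job_state", payload),
        )
```

Calling set_job_state with fail_audit=True leaves both tables unchanged, which is the property the paragraph describes: no state change survives without its receipt.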

The mental model

The canonical entity types and what each one represents:

  • Project — the top of a body of work. A long-lived container with its own goal, policy, and inheritance scope.
  • Workflow — a coherent stream of work inside a Project. Holds Jobs in sequence, carries patterns and shared context. (Previously called Pipeline in older docs.)
  • Job — a single unit of work, claimable by one Actor at a time, carrying an inline contract that defines its definition of done.
  • Actor — an authenticated identity. Can be a human or an AI agent. Holds API keys, claims Jobs, signs every audited action.
  • Decision — a durable rule or choice made during work. Can be superseded by a newer Decision; old ones are never deleted.
  • Learning — a durable lesson captured during work. Like a Decision but for "what we learned" rather than "what we chose."
  • Component, Objective, Visual — structured artifacts that attach to Projects, Workflows, and Jobs, giving them shape without forcing everything into unstructured prose.
  • Label and Subject Tag — lightweight cross-cutting categorization. Labels for filtering; Subject Tags for cross-artifact subject identity.
  • Job Comment — agent-written audit-trail commentary on a Job, separate from the contract. Human rationale belongs in Decisions and Learnings.
  • Run — a view derived from the audit log capturing one claim-to-submit lifecycle of a Job.
  • Audit Log Entry — the atomic write record. Every mutation produces exactly one, in the same transaction as the change.
  • Edges — typed links between Jobs (gated_on, parent_of, sequence_next) and between Jobs and Decisions / Learnings (references).

Decisions and Learnings can attach to a Project, Workflow, or Job. They inherit downward — a Decision on a Project is visible to every Workflow and Job under it.
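Downward inheritance can be pictured as a walk up the parent chain. The sketch below is an illustration only: the Node class, its fields, and visible_decisions are hypothetical names, and the ordering (Project first, then Workflow, then Job) is an assumption rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    kind: str  # "project", "workflow", or "job"
    parent: Optional["Node"] = None
    decisions: List[str] = field(default_factory=list)


def visible_decisions(node: Node) -> List[str]:
    """Collect Decisions from the root Project down to the given node."""
    chain = []
    current = node
    while current is not None:
        chain.append(current)
        current = current.parent
    result: List[str] = []
    # Reverse so that the outermost scope (the Project) comes first.
    for ancestor in reversed(chain):
        result.extend(ancestor.decisions)
    return result


project = Node("billing", "project", decisions=["use-postgres"])
workflow = Node("invoices", "workflow", parent=project, decisions=["batch-nightly"])
job = Node("add-export", "job", parent=workflow)
```

Here the Job has no Decisions of its own but still sees both the Project-level and the Workflow-level Decision, while the Project sees only its own.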

How an agent participates

The loop is small and you should learn it before doing anything else:

  1. Claim a Job. The substrate gives you exactly one Job at a time, with a heartbeat lease. While you hold it, no other Actor can claim it.
  2. Read the context. The claim returns navigation to the Project, Workflow, previous Jobs in sequence, the current Job, the next Job, and inherited Decisions and Learnings. You follow links from there to read what you need.
  3. Do the work. Make changes, run commands, capture evidence. If you make a Decision worth preserving, record it. If you learn something worth preserving, record it as a Learning.
  4. Heartbeat periodically. If you take too long without signaling life, the substrate will auto-release the Job so it can be reclaimed.
  5. Submit with an outcome. One of: done, pending_review, failed, blocked. The submission is validated against the Job's inline contract. Decisions and Learnings captured inline are persisted in the same transaction as the state change.
  6. Move on. Claim the next Job.
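The loop above can be modeled in memory to make the lease behavior concrete. This is a toy under stated assumptions, not the WerkGraph API: the Board and Job classes, their method names, and the explicit now argument are all hypothetical, and the real substrate does this in a database transaction with row-level locking. Only the four outcome names come from the list above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

OUTCOMES = {"done", "pending_review", "failed", "blocked"}


@dataclass
class Job:
    id: int
    state: str = "open"
    claimed_by: Optional[str] = None
    heartbeat_at: Optional[float] = None


class Board:
    def __init__(self, lease_seconds: float):
        self.lease_seconds = lease_seconds
        self.jobs: Dict[int, Job] = {}

    def claim(self, job_id: int, actor: str, now: float) -> bool:
        job = self.jobs[job_id]
        if job.state != "open":
            return False  # already claimed or already submitted
        job.state, job.claimed_by, job.heartbeat_at = "claimed", actor, now
        return True

    def heartbeat(self, job_id: int, actor: str, now: float) -> bool:
        job = self.jobs[job_id]
        if job.claimed_by != actor:
            return False
        job.heartbeat_at = now
        return True

    def sweep(self, now: float) -> List[int]:
        # Auto-release every claim whose last heartbeat is older than the lease.
        released = []
        for job in self.jobs.values():
            if job.claimed_by is not None and now - job.heartbeat_at > self.lease_seconds:
                job.state, job.claimed_by, job.heartbeat_at = "open", None, None
                released.append(job.id)
        return released

    def submit(self, job_id: int, actor: str, outcome: str) -> bool:
        job = self.jobs[job_id]
        if job.claimed_by != actor or outcome not in OUTCOMES:
            return False
        job.state, job.claimed_by, job.heartbeat_at = outcome, None, None
        return True
```

Passing the clock in as an argument keeps the sketch deterministic. It omits contract validation and the one-Job-per-Actor limit to stay short.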

If you get stuck, release_job puts the Job back. If something is wrong with the claim, reset_claim clears it. If a long-running work block outlives the claim window, the original claimant can call renew_claim (POST /jobs/{id}/renew, wg job renew <id>, or MCP renew_claim) to refresh claim_heartbeat_at while preserving attribution. Renewal works before the sweep, or after auto-release within WG_CLAIM_RENEWAL_GRACE_SECONDS provided no other Actor has claimed the Job. A failed renewal returns one of three codes: forbidden, lease_not_renewable_other_actor_claimed, or lease_not_renewable_grace_window_exceeded. To flag a related Job as a blocker, submit the blocked outcome with a gating reference.
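The renewal rules can be read as a small decision function. The sketch below is one plausible reading, not the actual implementation: the function name, its parameters, and the order of the checks are assumptions, as is the interpretation that forbidden means the caller is not the original claimant. The three failure codes and the renewed_claim audit name are taken from this document.

```python
from typing import Optional


def renewal_result(
    caller: str,
    last_claimant: str,
    released_at: Optional[float],
    reclaimed_by: Optional[str],
    now: float,
    grace_seconds: float,
) -> str:
    """Return "renewed_claim" or one of the three failure codes.

    released_at is None while the lease is still held (before the sweep).
    reclaimed_by is the Actor who claimed the Job after auto-release, if any.
    """
    if caller != last_claimant:
        return "forbidden"
    if released_at is None:
        # Lease still held by the caller: a plain refresh.
        return "renewed_claim"
    if reclaimed_by is not None:
        return "lease_not_renewable_other_actor_claimed"
    if now - released_at > grace_seconds:
        return "lease_not_renewable_grace_window_exceeded"
    return "renewed_claim"
```

In this reading, grace_seconds plays the role of WG_CLAIM_RENEWAL_GRACE_SECONDS; whether the boundary is inclusive is not stated in the source, so the comparison here is a guess.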

Local development workflow

Local PR verification should use isolated Docker Compose project names so test databases, networks, and ports do not collide with long-lived stacks. The default naming convention for short-lived PR stacks is:

  • wg-pr<PR number>, for example wg-pr123
  • wg-<slug>-test or wg-<slug>-test-<suffix> when a PR number is not part of the local stack name

After PRs merge or close, run the dry-run cleanup report:

wg dev reap-stacks

The command enumerates local Compose projects matching the convention, asks GitHub for each PR state through the authenticated gh CLI, and prints what it would remove. Destructive cleanup is opt-in:

wg dev reap-stacks --apply

By default, merged PR stacks are skipped for 30 minutes after merge to avoid racing an in-flight rebuild or teardown. Adjust that with --grace-minutes. Operators using a different local naming convention can set WG_PR_STACK_PATTERN to a regex; include a named (?P<pr>\d+) capture, or the first numeric capture group, when the command should correlate a stack with a GitHub PR. Use --repo owner/name when the local git remote is a fork but cleanup should check upstream PR state.
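To make the pattern and grace-period rules concrete, here is a small sketch of how a stack name might be correlated with a PR number. It is not the code behind wg dev reap-stacks: DEFAULT_PATTERN, pr_number, and past_grace are hypothetical, the real default regex is not shown in this document, and treating "first numeric capture group" as "first group whose match is all digits" is an assumption.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical pattern for the wg-pr<PR number> convention.
DEFAULT_PATTERN = r"^wg-pr(?P<pr>\d+)$"


def pr_number(stack_name: str, pattern: str = DEFAULT_PATTERN) -> Optional[int]:
    match = re.search(pattern, stack_name)
    if match is None:
        return None
    # Prefer the named "pr" group when the pattern defines one.
    named = match.groupdict().get("pr")
    if named is not None:
        return int(named)
    # Otherwise fall back to the first capture group that is all digits.
    for value in match.groups():
        if value is not None and value.isdecimal():
            return int(value)
    return None


def past_grace(merged_at: datetime, now: datetime, grace_minutes: int = 30) -> bool:
    # A merged stack becomes eligible for cleanup once the grace period elapses.
    return now - merged_at >= timedelta(minutes=grace_minutes)
```

With this sketch, pr_number("wg-pr123") yields 123, while a name such as wg-foo-test matches no PR and yields None. A custom pattern such as ^stack-(\d+)- falls through to the unnamed numeric group.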

Architectural principles that do not change

These are the locks. They shape every capability decision and are unlikely to ever reverse.

  • Same-transaction audit guarantee. Every state change and its audit row commit together or roll back together. There is no path that mutates state without also writing an audit entry.
  • Four-surface byte-equality. Every operation is reachable from REST, CLI, MCP, and (where applicable) UI, and returns the same payload from each. The surfaces are thin wrappers over a single in-process service layer.
  • Atomic claim with heartbeat lease. Job claiming is a single transaction with row-level locking. Heartbeats keep a lease alive; expiry auto-releases.
  • Attribution-preserving renewal. Renewals are audited as renewed_claim, not a second claim_job, so the timeline shows continuous work by the original claimant.
  • Inline contracts on every Job. Every Job carries a structured contract as a column on the row, not as a reference to an external profile. The contract is the durable definition of done.
  • Bi-temporal supersede semantics. Decisions can be superseded; the old row stays, the new row references the old. History is queryable, never destroyed.
  • Append-only audit log. Entries are never updated or deleted. The log is the receipt of every action ever taken.
  • First-class Decisions and Learnings. Rationale and lessons are entities with their own tables, ops, and inheritance — not comments stapled to Jobs.
  • No overengineering of speculative load. Features like size budgeting, hard-refuse gating, or speculative model registries are explicitly out. The substrate sizes for actual measured load, not imagined load.
  • No external profile registries. Contracts live on the Job. Skills live in code. There is no separate registry to keep in sync.
  • Reads are never audited; mutations always are. Reads are free and fast. Mutations always pay the audit cost.
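The supersede principle in the list above can be sketched as an append-only list in which a new row points back at the row it replaces. This is a simplified illustration: DecisionLog, its methods, and the integer ids are hypothetical, and the sketch models only the supersede link, not the time dimensions that "bi-temporal" implies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Decision:
    id: int
    text: str
    supersedes: Optional[int] = None


class DecisionLog:
    def __init__(self) -> None:
        self._rows: List[Decision] = []

    def record(self, text: str, supersedes: Optional[int] = None) -> Decision:
        if supersedes is not None and all(d.id != supersedes for d in self._rows):
            raise KeyError(supersedes)
        # Rows are only ever appended; nothing is updated or deleted.
        row = Decision(id=len(self._rows) + 1, text=text, supersedes=supersedes)
        self._rows.append(row)
        return row

    def current(self) -> List[Decision]:
        # A Decision is current if no later row supersedes it.
        superseded = {d.supersedes for d in self._rows if d.supersedes is not None}
        return [d for d in self._rows if d.id not in superseded]

    def history(self, decision_id: int) -> List[Decision]:
        # Walk from the given Decision back through everything it replaced.
        by_id = {d.id: d for d in self._rows}
        chain = []
        cursor: Optional[int] = decision_id
        while cursor is not None:
            row = by_id[cursor]
            chain.append(row)
            cursor = row.supersedes
        return chain
```

Superseding a Decision changes what current() returns, while history() still reaches the old row, which is the "queryable, never destroyed" property.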

What WerkGraph is not, and will not become

  • Not a chat assistant. Not an LLM. Not a model router.
  • Not a code knowledge graph (that category is solved by other tools).
  • Not an LLM observability tool (that category is solved by other tools).
  • Not a peer to issue trackers — it is the substrate intended to replace them for AI-agent-coordinated work. References to external trackers in older docs are operational, not architectural.
  • Not a fully-autonomous-by-default system. The substrate enables supervised orchestration; autonomy is a caller choice, not a default.

The three-layer vision

WerkGraph is the bottom layer of a three-layer architecture. The other two layers are scoped but unbuilt at the time of this writing.

  • WerkGraph (substrate) — the entity graph, the audit log, the contracts, the claim/submit loop. This repo.
  • WerkForce (orchestration) — multi-agent delegation, session tracking, mid-flight intervention. Wraps subprocess agent CLIs and exposes them as orchestration tools to a supervising agent.
  • Werker (identity) — per-agent identity, capability tags, cost tier, model routing. Lives inside WerkForce.

Until WerkForce ships, the orchestrating agent is whichever AI session a human starts. The substrate doesn't care; it just records what each authenticated Actor does.

The verify-don't-trust rule

This applies to humans and agents alike. Every claim about the substrate's state must be verifiable from the audit log or the entity graph. Do not paraphrase what an output "would have been." Do not summarize agent self-reports as fact. If you didn't run the command, mark the result blocked, not done. A submission with fabricated evidence is worse than a submission that admits the work isn't finished.

The one-line summary

WerkGraph is the durable record of AI-agent work — what was done, why, against what contract, by whom, with what effect — written atomically alongside the state changes it describes, queryable as a graph, and built to outlast any individual session, model, or operator.


Created and maintained by Mario Watson. Licensed AGPL-3.0-or-later. Bug reports and patches: GitHub issues and pull requests. See CONTRIBUTING.md for local development commands.
