
Intent-layer enrichment for graphify knowledge graphs: extract the decisions, mechanisms, constraints, and trade-offs behind your docs as grounded, anchored graph nodes.


graphify-intent


Post-process a graphify knowledge graph with an intent layer — the why behind your docs. Where graphify extracts what exists, graphify-intent extracts the decisions, mechanisms, constraints, and trade-offs your prose encodes, each with a grounded rationale, and anchors them to the concepts graphify already found. A pipeline of LLM passes — extract → anchor → cross-doc relate, plus an opt-in cross-document concept-resolution pass — produces a sidecar JSON and an enriched graph.json.

flowchart LR
    D["docs/*.md"] --> A
    G["graph.json"] --> B
    A["Pass A<br/>extract intent"] --> B["Pass B<br/>anchor to concepts"]
    B --> C["Pass C<br/>cross-doc intent"]
    B --> R["Pass D<br/>concept resolution<br/>(opt-in)"]
    A --> OUT
    C --> OUT
    R --> OUT
    OUT["outputs:<br/>.graphify_intent.json<br/>graph.enriched.json<br/>enrichment_report.md"]

Requirements

graphify-intent post-processes graphify graphs and depends on graphifyy (the graphify runtime that every backend calls) — it is installed automatically (ADR-0010). By default it uses your Claude Pro/Max subscription via the Claude Code CLI — no API key required. To use a billed API key instead, see LLM backend.

Installation

pip install graphify-intent          # brings graphifyy (the graphify runtime) with it
# or as a standalone tool:
uv tool install graphify-intent

For an API backend, add the provider extra so graphify has its SDK (the subscription backend needs neither):

pip install 'graphify-intent[anthropic]'   # Anthropic API
pip install 'graphify-intent[gemini]'      # Gemini API

To enable Pass D's embedding-based concept resolution, add the optional extra:

pip install 'graphify-intent[embeddings]'   # local model2vec embedder for Pass D

Without it, Pass D still runs — it degrades gracefully to a lexical-similarity fallback.

Quick start

# Uses your Claude Pro/Max subscription by default (needs the `claude` CLI).
graphify-intent \
  --graph graphify-out/graph.json \
  --docs docs/ \
  --passes A,B,C

LLM backend

By default, graphify-intent uses your Claude Pro/Max subscription through the Claude Code CLI (claude) — no API key, no per-token cost. It falls back to a billed API key if the CLI isn't available.

| --backend | Behaviour |
| --- | --- |
| (unset) | Subscription if the claude CLI is on your PATH, else a detected API key |
| subscription | Force the Claude Code CLI (subscription auth) |
| api | Force a billed API key (ANTHROPIC_API_KEY / GEMINI_API_KEY) |
| claude / gemini / … | Force a specific graphify backend |

You can also set GRAPHIFY_INTENT_BACKEND. For subscription mode, install the claude CLI and run it once to sign in. For API mode, install the matching extra (graphify-intent[anthropic] or graphify-intent[gemini]) and provide the key.
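The selection rules above can be sketched in a few lines. This is an illustration rather than the project's actual code: the function name resolve_backend is hypothetical, and the precedence of the --backend flag over GRAPHIFY_INTENT_BACKEND is an assumption the text does not state.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

API_KEY_VARS = ("ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def resolve_backend(
    flag: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Pick a backend name following the table above (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: an explicit flag wins over the environment variable.
    choice = flag or env.get("GRAPHIFY_INTENT_BACKEND")
    if choice:
        return choice
    # Unset: prefer the subscription when the claude CLI is on PATH.
    if which("claude"):
        return "subscription"
    # Otherwise fall back to a detected API key.
    if any(env.get(name) for name in API_KEY_VARS):
        return "api"
    raise RuntimeError("no backend available: install the claude CLI or set an API key")
```

Passing env and which as parameters keeps the sketch testable without touching the real environment.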

Keep API keys out of .env and your shell history

If you do use an API key, prefer a password-manager CLI over inlining the secret. With 1Password's op, inject it at runtime so it never lands on disk in plaintext:

export ANTHROPIC_API_KEY="$(op read 'op://<vault>/<item>/credential')"
graphify-intent --backend api --graph graphify-out/graph.json --docs docs/

…or wrap the command with op run so the secret lives only for that process:

op run --env-file=.env.op -- graphify-intent --backend api --graph graphify-out/graph.json --docs docs/

A real key committed in .env risks leaking into git history, CI logs, and backups; a manager keeps it encrypted, access-audited, and revocable. Best of all, the default subscription backend needs no key at all.

CLI flags

| Flag | Default | Description |
| --- | --- | --- |
| --graph | graphify-out/graph.json | Path to the graphify output graph.json |
| --docs | docs/ | Directory containing markdown source files |
| --passes | A,B,C | Comma-separated passes: A extract, B anchor, C cross-doc relate, D cross-doc concept resolution (opt-in) |
| --backend | (auto) | Backend selection (see LLM backend) |
| --min-confidence | 0.75 | Minimum confidence_score for Pass C intent edges |
| --max-concurrency | 4 | Maximum number of concurrent LLM requests |
| --max-cost-usd | 2.00 | Cost guard for API backends: prompts if the estimate exceeds this (skipped for the free subscription backend) |
| --yes | false | Skip the cost-guard prompt (non-interactive / CI) |

Set --backend to subscription, api, or an explicit provider name (see LLM backend).
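How --max-cost-usd, --yes, and the backend interact can be read off the flag descriptions. A minimal sketch of that decision, assuming a hypothetical helper named needs_confirmation and an estimate computed elsewhere:

```python
def needs_confirmation(
    backend: str,
    estimated_usd: float,
    max_cost_usd: float = 2.00,
    assume_yes: bool = False,
) -> bool:
    """Return True when the cost-guard prompt should be shown (hypothetical helper)."""
    if backend == "subscription":
        return False  # the guard is skipped for the subscription backend
    if assume_yes:
        return False  # --yes skips the prompt, e.g. in CI
    return estimated_usd > max_cost_usd
```

Under these assumptions, an API run estimated at 3.50 USD with the default limit would prompt, while the same run with --yes would not.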

Outputs

All outputs are written to the same directory as graph.json:

| File | Contents |
| --- | --- |
| .graphify_intent.json | Sidecar: all intent nodes and edges added across runs |
| graph.enriched.json | Original graph merged with the intent layer (graphify node-link format) |
| enrichment_report.md | Before/after metrics: node/link/isolated counts, intent breakdown by kind, anchored vs orphaned, grounding coverage |
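To sanity-check an enriched graph by hand, the node/link/isolated counts can be recomputed from the node-link JSON. The sketch below assumes top-level "nodes" and "links" lists, with "id" on nodes and "source"/"target" on links. These key names follow the common node-link convention and are not confirmed by this README. The helper name graph_counts is hypothetical.

```python
def graph_counts(graph: dict) -> dict:
    """Count nodes, links, and isolated nodes in a node-link dict (assumed keys)."""
    nodes = graph.get("nodes", [])
    links = graph.get("links", [])
    connected = set()
    for link in links:
        connected.add(link["source"])
        connected.add(link["target"])
    # A node is isolated when no link names it as source or target.
    isolated = sum(1 for node in nodes if node["id"] not in connected)
    return {"nodes": len(nodes), "links": len(links), "isolated": isolated}


sample = {
    "nodes": [{"id": "jwt"}, {"id": "intent_1"}, {"id": "orphan"}],
    "links": [{"source": "intent_1", "target": "jwt", "relation": "rationale_for"}],
}
print(graph_counts(sample))  # {'nodes': 3, 'links': 1, 'isolated': 1}
```

For a real run, pass the result of json.load on graph.enriched.json instead of the sample dict.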

How it works

Pass A — intent extraction. Each markdown document is split into heading-bounded sections. Per section, the LLM extracts 0-N intent units — a decision, mechanism, constraint, or tradeoff — each with a one-sentence claim, the rationale (the why), any alternatives considered, and a deterministic ID grounded to the section's character span.
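The two deterministic parts of Pass A, heading-bounded splitting and span-grounded IDs, can be sketched as follows. This is an illustration under assumptions, not the package's implementation: the names Section, split_sections, and intent_id, the ID format, and the choice of hashed fields are all hypothetical. The regex also ignores text before the first heading and would misread "#" lines inside code fences.

```python
import hashlib
import re
from dataclasses import dataclass

# ATX headings only: one to six '#' characters followed by whitespace.
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


@dataclass(frozen=True)
class Section:
    title: str
    start: int  # character offset of the heading line
    end: int    # offset where the next heading (or end of file) begins


def split_sections(text: str) -> list:
    """Split markdown into heading-bounded sections, keeping character spans."""
    matches = list(HEADING.finditer(text))
    sections = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append(Section(match.group(1), match.start(), end))
    return sections


def intent_id(source_file: str, section: Section, kind: str, claim: str) -> str:
    """Hash stable inputs so the same claim in the same span yields the same ID."""
    key = f"{source_file}|{section.start}-{section.end}|{kind}|{claim.strip().lower()}"
    return "intent_" + hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


doc = "# Auth\nWe use JWT.\n\n## Rotation\nKeys rotate weekly.\n"
sections = split_sections(doc)
print([(s.title, s.start, s.end) for s in sections])
# [('Auth', 0, 20), ('Rotation', 20, 52)]
```

Because the ID depends only on the file, span, kind, and normalized claim, re-extracting the same unit produces the same ID, which is what makes skip-by-ID re-runs possible.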

Pass B — anchoring. Each intent node is linked to the graph concept(s) it explains via a rationale_for edge. Candidate graph nodes are found without embeddings — same source_file plus lexical overlap — then the LLM adjudicates which the intent actually explains.
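The embedding-free candidate step might look like the sketch below: filter to the same source_file, then rank by shared tokens. The field names claim, rationale, and label, the tokenizer, and the min_overlap threshold are assumptions for illustration. The LLM adjudication that follows is not shown.

```python
import re


def tokens(text: str) -> set:
    """Lowercased alphanumeric tokens (a deliberately simple tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def candidate_nodes(intent: dict, graph_nodes: list, min_overlap: int = 1) -> list:
    """Same-file graph nodes ranked by lexical overlap with the intent text."""
    intent_tokens = tokens(intent["claim"] + " " + intent["rationale"])
    scored = []
    for node in graph_nodes:
        if node.get("source_file") != intent["source_file"]:
            continue  # candidates must come from the same document
        overlap = len(tokens(node["label"]) & intent_tokens)
        if overlap >= min_overlap:
            scored.append((overlap, node))
    # Highest overlap first; ties broken by node ID for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]["id"]))
    return [node for _, node in scored]


intent = {
    "claim": "Use JWT access tokens",
    "rationale": "Stateless verification avoids a session store",
    "source_file": "docs/auth.md",
}
nodes = [
    {"id": "n1", "label": "JWT", "source_file": "docs/auth.md"},
    {"id": "n2", "label": "Session store", "source_file": "docs/auth.md"},
    {"id": "n3", "label": "JWT", "source_file": "docs/api.md"},
    {"id": "n4", "label": "Rate limiter", "source_file": "docs/auth.md"},
]
print([n["id"] for n in candidate_nodes(intent, nodes)])  # ['n2', 'n1']
```

Here n3 is excluded for being in another file and n4 for sharing no tokens, so only two candidates would go to the LLM.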

Pass C — cross-doc intent. A single LLM call over all intent nodes surfaces intent-level relationships between them — supersedes, trade_off_against, constrains, motivated_by, contradicts. Edges below --min-confidence are discarded.
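After the LLM call, the filtering step amounts to validating the relation and applying the confidence floor. A minimal sketch, assuming edges arrive as dicts with relation and confidence_score keys (the helper name keep_edges is hypothetical):

```python
INTENT_RELATIONS = {
    "supersedes",
    "trade_off_against",
    "constrains",
    "motivated_by",
    "contradicts",
}


def keep_edges(edges: list, min_confidence: float = 0.75) -> list:
    """Drop edges with an unknown relation or a score below the threshold."""
    kept = []
    for edge in edges:
        if edge.get("relation") not in INTENT_RELATIONS:
            continue
        score = edge.get("confidence_score")
        # Reject missing or non-numeric scores (bool is excluded explicitly).
        if isinstance(score, bool) or not isinstance(score, (int, float)):
            continue
        if score < min_confidence:
            continue
        kept.append(edge)
    return kept


proposed = [
    {"source": "i1", "target": "i2", "relation": "supersedes", "confidence_score": 0.9},
    {"source": "i1", "target": "i3", "relation": "constrains", "confidence_score": 0.6},
    {"source": "i2", "target": "i3", "relation": "related_to", "confidence_score": 0.95},
]
print(len(keep_edges(proposed)))  # 1
```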

Pass D — cross-doc concept resolution (opt-in). Enable with --passes A,B,C,D. Pass D unifies the graphify concepts the intent layer touches non-destructively, emitting same_concept edges between existing concept nodes (nothing is merged or rewritten). Cross-document intent linkage then emerges transitively — intent —rationale_for→ JWT —same_concept— token ←rationale_for— intent. Candidate pairs are cross-file only, blocked by embedding cosine when a local model is available (optional [embeddings] extra) or lexical Jaccard otherwise, then the LLM adjudicates each pair; edges below --min-confidence are dropped. Every edge records its method (embedding/lexical) and similarity for provenance.
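The lexical fallback for candidate blocking can be sketched as cross-file pairs filtered by Jaccard similarity. Token-level Jaccard over a label field, the 0.5 threshold, and the function names are assumptions for illustration. Only the method and similarity fields on each pair come from the description above.

```python
import re
from itertools import combinations


def label_tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union; 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cross_file_pairs(concepts: list, threshold: float = 0.5) -> list:
    """Candidate same_concept pairs: different files, similarity at or above threshold."""
    pairs = []
    for left, right in combinations(concepts, 2):
        if left["source_file"] == right["source_file"]:
            continue  # candidate pairs are cross-file only
        similarity = jaccard(label_tokens(left["label"]), label_tokens(right["label"]))
        if similarity >= threshold:
            pairs.append({
                "source": left["id"],
                "target": right["id"],
                "method": "lexical",
                "similarity": similarity,
            })
    return pairs


concepts = [
    {"id": "c1", "label": "JWT token", "source_file": "docs/auth.md"},
    {"id": "c2", "label": "token", "source_file": "docs/api.md"},
    {"id": "c3", "label": "rate limit", "source_file": "docs/api.md"},
]
print(cross_file_pairs(concepts))
# [{'source': 'c1', 'target': 'c2', 'method': 'lexical', 'similarity': 0.5}]
```

Labels with no shared tokens, such as "JWT" and "token", score 0.0 here. Pairs like that are the ones an embedding-based blocker can still surface.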

Idempotency

Re-running graphify-intent is safe. Intent nodes already present in the sidecar (.graphify_intent.json) are skipped by ID, and edges are skipped by (source, target, relation) key, so a re-run adds only genuinely new intent. You can also run passes separately (e.g. --passes A then --passes B,C): when A is not in the run, later passes seed from the existing sidecar.
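The skip rules amount to a set-membership check before each append. A minimal sketch, assuming the sidecar holds "nodes" and "edges" lists (the layout and the name merge_into_sidecar are hypothetical):

```python
def merge_into_sidecar(sidecar: dict, new_nodes: list, new_edges: list) -> tuple:
    """Append only unseen nodes and edges; return how many of each were added."""
    seen_ids = {node["id"] for node in sidecar["nodes"]}
    seen_edges = {(e["source"], e["target"], e["relation"]) for e in sidecar["edges"]}
    added_nodes = added_edges = 0
    for node in new_nodes:
        if node["id"] in seen_ids:
            continue  # already in the sidecar: skipped by ID
        sidecar["nodes"].append(node)
        seen_ids.add(node["id"])
        added_nodes += 1
    for edge in new_edges:
        key = (edge["source"], edge["target"], edge["relation"])
        if key in seen_edges:
            continue  # same (source, target, relation): skipped
        sidecar["edges"].append(edge)
        seen_edges.add(key)
        added_edges += 1
    return added_nodes, added_edges


sidecar = {"nodes": [], "edges": []}
run_nodes = [{"id": "intent_a"}, {"id": "intent_b"}]
run_edges = [{"source": "intent_a", "target": "jwt", "relation": "rationale_for"}]
print(merge_into_sidecar(sidecar, run_nodes, run_edges))  # (2, 1)
print(merge_into_sidecar(sidecar, run_nodes, run_edges))  # (0, 0)
```

The second call adds nothing, which is the behaviour a safe re-run relies on.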

Development

git clone <repo>
cd graphify-intent
pip install -e ".[dev]"
python -m pytest tests/ -v

Tests cover every module — IDs, section splitting with spans, relation/confidence validation, all four passes (including Pass D's candidate resolution and embedding fallback), merge/enriched-graph assembly, the report, backend resolution, and an end-to-end smoke test. The LLM boundary and the graphify-runtime probe are injected/mocked, so the suite makes no network calls and needs no API key or live backend.

graphify runtime dependency

Every backend routes through graphify's graphify.llm, so graphify-intent declares graphifyy as a runtime dependency: pip install graphify-intent installs it (ADR-0010). For an API backend, add the provider extra so graphify's SDK is present:

pip install 'graphify-intent[anthropic]'   # --backend api with Anthropic
pip install 'graphify-intent[gemini]'      # --backend api with Gemini

The default subscription backend needs no provider SDK — only the Claude Code CLI (claude), installed and signed in. If the graphify runtime is somehow missing, graphify-intent exits immediately with a clear install message rather than failing deep inside a run.
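A fail-fast probe of this kind can be written with the standard library alone. The sketch below is an assumption about the approach, not the project's code: the function name and the message wording are hypothetical, and the import name graphify is inferred from the graphify.llm reference above.

```python
import importlib.util
import sys
from typing import Callable


def ensure_graphify_runtime(
    find_spec: Callable[[str], object] = importlib.util.find_spec,
) -> None:
    """Exit with an install hint if the graphify package cannot be located."""
    # find_spec returns None for a missing top-level package, without importing it.
    if find_spec("graphify") is None:
        sys.exit("graphify runtime not found; install it with: pip install graphifyy")
```

Injecting find_spec mirrors the testing note above: the probe can be exercised in a suite without the runtime installed.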

Architecture decisions

Design rationale is recorded as ADRs in docs/adr/:

| ADR | Decision |
| --- | --- |
| 0001 | Direct, schema-enforced LLM calls |
| 0002 | Intent/rationale as the product wedge + node data model |
| 0003 | Span-level grounding & provenance |
| 0004 | Embedding-free, same-file anchoring (v1) |
| 0005 | Extend graphify's relation vocabulary deliberately |
| 0006 | Enriched-graph assembly on the node-link format |
| 0007 | Determinism, idempotency, and a single confidence policy |
| 0008 | Subscription-first LLM backend via the Claude CLI |
| 0009 | Cross-document concept resolution (Pass D) |
| 0010 | graphify (graphifyy) is a hard runtime dependency |

License and attribution

graphify-intent is licensed under the MIT License — Copyright (c) 2026 Will Neill.

This project is an independent post-processor built to interoperate with graphify by Safi Shamsi. It reuses graphify's graph schema and relation vocabulary and calls graphify as a separately-installed runtime dependency; no graphify source code is bundled with or distributed as part of this project. graphify is licensed under the MIT License (Copyright (c) 2026 Safi Shamsi) — see the ACKNOWLEDGEMENT AND ATTRIBUTION section of this repository's LICENSE file and the upstream license for the full text. With thanks to the graphify project.
