Intent-layer enrichment for graphify knowledge graphs: extract the decisions, mechanisms, constraints, and trade-offs behind your docs as grounded, anchored graph nodes.
graphify-intent
Post-process a graphify knowledge graph with an intent layer — the why behind your docs. Where graphify extracts what exists, graphify-intent extracts the decisions, mechanisms, constraints, and trade-offs your prose encodes, each with a grounded rationale, and anchors them to the concepts graphify already found. A pipeline of LLM passes — extract → anchor → cross-doc relate, plus an opt-in cross-document concept-resolution pass — produces a sidecar JSON and an enriched graph.json.
flowchart LR
D["docs/*.md"] --> A
G["graph.json"] --> B
A["Pass A<br/>extract intent"] --> B["Pass B<br/>anchor to concepts"]
B --> C["Pass C<br/>cross-doc intent"]
B --> R["Pass D<br/>concept resolution<br/>(opt-in)"]
A --> OUT
C --> OUT
R --> OUT
OUT["outputs:<br/>.graphify_intent.json<br/>graph.enriched.json<br/>enrichment_report.md"]
Requirements
graphify-intent post-processes graphify graphs, so you need graphifyy installed in the same Python environment. By default it uses your Claude Pro/Max subscription via the Claude Code CLI — no API key required. To use a billed API key instead, see LLM backend.
Installation
Install into the same environment as graphify:
pip install graphify-intent
# or with uv:
uv pip install graphify-intent
To enable Pass D's embedding-based concept resolution, add the optional extra:
pip install "graphify-intent[embeddings]" # local model2vec embedder for Pass D
Without it, Pass D still runs — it degrades gracefully to a lexical-similarity fallback.
Quick start
# Uses your Claude Pro/Max subscription by default (needs the `claude` CLI).
graphify-intent \
--graph graphify-out/graph.json \
--docs docs/ \
--passes A,B,C
LLM backend
By default, graphify-intent uses your Claude Pro/Max subscription through the
Claude Code CLI (claude) — no API key,
no per-token cost. It falls back to a billed API key if the CLI isn't available.
| --backend | Behaviour |
|---|---|
| (unset) | Subscription if the claude CLI is on your PATH, else a detected API key |
| subscription | Force the Claude Code CLI (subscription auth) |
| api | Force a billed API key (ANTHROPIC_API_KEY / GEMINI_API_KEY) |
| claude / gemini / … | Force a specific graphify backend |
You can also set GRAPHIFY_INTENT_BACKEND. For subscription mode, install the claude
CLI and run it once to sign in. For API mode, install the matching extra
(graphifyy[anthropic] or graphifyy[gemini]) and provide the key.
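The resolution order in the table can be sketched as a small pure function. This is an illustration only, not graphify-intent's actual code: `resolve_backend` is a hypothetical name, and the assumption that the `--backend` flag takes precedence over `GRAPHIFY_INTENT_BACKEND` is ours, not stated by the project.

```python
from typing import Mapping, Optional

# Environment variables the table lists for billed API access.
API_KEY_VARS = ("ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def resolve_backend(flag: Optional[str], env: Mapping[str, str], cli_on_path: bool) -> str:
    """Hypothetical helper mirroring the backend table above."""
    # Assumption: an explicit flag wins over the environment variable.
    choice = flag or env.get("GRAPHIFY_INTENT_BACKEND") or None
    has_key = any(env.get(var) for var in API_KEY_VARS)

    if choice is None:
        # Unset: prefer the subscription CLI, else fall back to a detected key.
        if cli_on_path:
            return "subscription"
        if has_key:
            return "api"
        raise RuntimeError("no claude CLI on PATH and no API key detected")
    if choice == "subscription" and not cli_on_path:
        raise RuntimeError("subscription backend requested but claude CLI not found")
    if choice == "api" and not has_key:
        raise RuntimeError("api backend requested but no API key is set")
    # "subscription", "api", or an explicit provider name such as "gemini".
    return choice
```

In practice the third argument would come from something like `shutil.which("claude") is not None`, and `env` from `os.environ`.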
Keep API keys out of .env and your shell history
If you do use an API key, prefer a password-manager CLI over inlining the secret. With
1Password's op, inject it at runtime so it
never lands on disk in plaintext:
export ANTHROPIC_API_KEY="$(op read 'op://<vault>/<item>/credential')"
graphify-intent --backend api --graph graphify-out/graph.json --docs docs/
…or wrap the command with op run so the secret lives only for that process:
op run --env-file=.env.op -- graphify-intent --backend api --graph graphify-out/graph.json --docs docs/
A real key committed in .env risks leaking into git history, CI logs, and backups; a
manager keeps it encrypted, access-audited, and revocable. Best of all, the default
subscription backend needs no key at all.
CLI flags
| Flag | Default | Description |
|---|---|---|
| --graph | graphify-out/graph.json | Path to the graphify output graph.json |
| --docs | docs/ | Directory containing markdown source files |
| --passes | A,B,C | Comma-separated passes: A extract, B anchor, C cross-doc relate, D cross-doc concept resolution (opt-in) |
| --backend | (auto) | Backend selection (see LLM backend) |
| --min-confidence | 0.75 | Minimum confidence_score for Pass C intent edges |
| --max-concurrency | 4 | Maximum number of concurrent LLM requests |
| --max-cost-usd | 2.00 | Cost guard for API backends: prompts if the estimate exceeds this (skipped for the free subscription backend) |
| --yes | false | Skip the cost-guard prompt (non-interactive / CI) |
Set --backend to subscription, api, or an explicit provider name (see LLM backend).
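How `--max-cost-usd`, `--yes`, and the backend choice interact can be summarised in a few lines. The function below is a minimal sketch with a hypothetical name (`needs_confirmation`); it only restates the rules from the table and does not show how graphify-intent estimates cost.

```python
def needs_confirmation(
    estimate_usd: float,
    max_cost_usd: float = 2.00,
    backend: str = "api",
    assume_yes: bool = False,
) -> bool:
    """Return True when the cost-guard prompt should be shown (illustrative only)."""
    if backend == "subscription":
        # The subscription backend has no per-token cost, so the guard is skipped.
        return False
    if assume_yes:
        # --yes suppresses the prompt for non-interactive / CI runs.
        return False
    # Prompt only when the estimate exceeds the configured ceiling.
    return estimate_usd > max_cost_usd
```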
Outputs
All outputs are written to the same directory as graph.json:
| File | Contents |
|---|---|
| .graphify_intent.json | Sidecar: all intent nodes and edges added across runs |
| graph.enriched.json | Original graph merged with the intent layer (graphify node-link format) |
| enrichment_report.md | Before/after metrics: node/link/isolated counts, intent breakdown by kind, anchored vs orphaned, grounding coverage |
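To sanity-check an enriched graph yourself, the node, link, and isolated counts can be recomputed from the JSON. The sketch below assumes the common node-link layout (top-level `nodes` and `links` lists, with `id`, `source`, and `target` keys); the exact keys graphify writes are not documented here, so treat them as assumptions. `graph_metrics` is a hypothetical helper, not part of the package.

```python
def graph_metrics(graph: dict) -> dict:
    """Count nodes, links, and isolated nodes in a node-link style graph dict."""
    nodes = graph.get("nodes", [])
    links = graph.get("links", [])

    # Any node that appears as an endpoint of some link is connected.
    connected = set()
    for link in links:
        connected.add(link["source"])
        connected.add(link["target"])

    isolated = [node["id"] for node in nodes if node["id"] not in connected]
    return {"nodes": len(nodes), "links": len(links), "isolated": len(isolated)}
```

Loading the file with `json.load` and passing the result to `graph_metrics` before and after enrichment gives a rough equivalent of the first line of the report.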
How it works
Pass A — intent extraction. Each markdown document is split into heading-bounded sections. Per section, the LLM extracts 0-N intent units — a decision, mechanism, constraint, or tradeoff — each with a one-sentence claim, the rationale (the why), any alternatives considered, and a deterministic ID grounded to the section's character span.
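The two mechanical parts of Pass A, heading-bounded splitting with character spans and deterministic IDs, might look roughly like this. It is a sketch under stated assumptions: the names (`Section`, `split_sections`, `intent_id`), the ID recipe (a truncated SHA-256 over file, span, kind, and normalised claim), and the simple heading regex are all illustrative, and the regex does not skip `#` lines inside fenced code blocks.

```python
import hashlib
import re
from dataclasses import dataclass

# ATX-style markdown headings: one to six '#' followed by a space.
HEADING = re.compile(r"^#{1,6} .*$", re.MULTILINE)


@dataclass(frozen=True)
class Section:
    heading: str
    start: int  # character offset where the section begins (inclusive)
    end: int    # character offset where the section ends (exclusive)


def split_sections(text: str) -> list[Section]:
    """Split markdown into heading-bounded sections, keeping character spans."""
    matches = list(HEADING.finditer(text))
    if not matches:
        return [Section("", 0, len(text))] if text.strip() else []
    sections = []
    # Keep any non-empty preamble before the first heading as its own section.
    if text[: matches[0].start()].strip():
        sections.append(Section("", 0, matches[0].start()))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append(Section(match.group(0).lstrip("#").strip(), match.start(), end))
    return sections


def intent_id(source_file: str, start: int, end: int, kind: str, claim: str) -> str:
    """Hypothetical deterministic ID: same inputs always give the same ID."""
    key = f"{source_file}:{start}:{end}:{kind}:{claim.strip().lower()}"
    return "intent_" + hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
```

Because the ID depends only on the source file, span, kind, and claim text, re-extracting the same unit from an unchanged section yields the same ID, which is what makes skip-by-ID on re-runs possible.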
Pass B — anchoring. Each intent node is linked to the graph concept(s) it explains via a rationale_for edge. Candidate graph nodes are found without embeddings — same source_file plus lexical overlap — then the LLM adjudicates which the intent actually explains.
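The embedding-free candidate step can be illustrated as a same-file filter followed by a token-overlap ranking. This is a sketch only: the field names `claim`, `rationale`, and `label`, the `min_overlap` and `top_k` parameters, and the tokenizer are assumptions made for the example; `source_file` is the one field named in the description above.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens; a deliberately simple tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def candidate_concepts(
    intent: dict, graph_nodes: list[dict], min_overlap: int = 1, top_k: int = 5
) -> list[str]:
    """Return IDs of same-file graph nodes that share terms with the intent."""
    intent_terms = tokens(intent["claim"] + " " + intent["rationale"])
    scored = []
    for node in graph_nodes:
        # Same-file restriction: only concepts from the intent's own document.
        if node.get("source_file") != intent["source_file"]:
            continue
        overlap = len(intent_terms & tokens(node.get("label", "")))
        if overlap >= min_overlap:
            scored.append((overlap, node["id"]))
    # Highest overlap first; ties broken by ID so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [node_id for _, node_id in scored[:top_k]]
```

The returned candidates are only a shortlist; in the pipeline described above, the LLM still decides which of them the intent actually explains.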
Pass C — cross-doc intent. A single LLM call over all intent nodes surfaces intent-level relationships between them — supersedes, trade_off_against, constrains, motivated_by, contradicts. Edges below --min-confidence are discarded.
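The post-processing of Pass C output amounts to a vocabulary check plus a threshold. A minimal sketch, assuming each edge is a dict with `relation` and `confidence_score` keys (the latter name comes from the flag description; the dict shape and the choice to drop unknown relations are assumptions):

```python
# Relation vocabulary listed for Pass C.
ALLOWED_RELATIONS = {
    "supersedes",
    "trade_off_against",
    "constrains",
    "motivated_by",
    "contradicts",
}


def filter_intent_edges(edges: list[dict], min_confidence: float = 0.75) -> list[dict]:
    """Keep edges with a known relation and a score at or above the threshold."""
    kept = []
    for edge in edges:
        if edge.get("relation") not in ALLOWED_RELATIONS:
            continue  # assumption: relations outside the vocabulary are dropped
        score = edge.get("confidence_score")
        if isinstance(score, bool) or not isinstance(score, (int, float)):
            continue  # missing or malformed score
        if score < min_confidence:
            continue  # below --min-confidence: discarded
        kept.append(edge)
    return kept
```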
Pass D — cross-doc concept resolution (opt-in). Enable with --passes A,B,C,D.
Pass D non-destructively unifies the graphify concepts the intent layer touches,
emitting same_concept edges between existing concept nodes (nothing is merged or
rewritten). Cross-document intent linkage then emerges transitively —
intent —rationale_for→ JWT —same_concept— token ←rationale_for— intent. Candidate
pairs are cross-file only, blocked (pre-filtered) by embedding cosine similarity when a local model is
available (optional [embeddings] extra) or by lexical Jaccard similarity otherwise, then the LLM
adjudicates each pair; edges below --min-confidence are dropped. Every edge records its
method (embedding/lexical) and similarity for provenance.
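The lexical fallback for candidate blocking can be sketched as Jaccard similarity over label tokens, restricted to cross-file pairs. The names, the `label` field, and the 0.5 blocking threshold are assumptions for illustration; the `method` and `similarity` fields follow the provenance description above.

```python
import re
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a concept label."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(concepts: list[dict], threshold: float = 0.5) -> list[dict]:
    """Cross-file concept pairs whose label similarity clears the threshold."""
    pairs = []
    for left, right in combinations(concepts, 2):
        if left["source_file"] == right["source_file"]:
            continue  # cross-file only
        similarity = jaccard(tokens(left["label"]), tokens(right["label"]))
        if similarity >= threshold:
            pairs.append(
                {
                    "source": left["id"],
                    "target": right["id"],
                    "method": "lexical",  # provenance: which blocker produced the pair
                    "similarity": similarity,
                }
            )
    return pairs
```

A purely lexical blocker cannot pair labels with no shared tokens (for example `JWT` and `token`), which is the kind of case the optional embedding path is better placed to catch.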
Idempotency
Re-running graphify-intent is safe. Intent nodes already present in the sidecar (.graphify_intent.json) are skipped by ID, and edges are skipped by (source, target, relation) key, so a re-run adds only genuinely new intent. You can also run passes separately (e.g. --passes A then --passes B,C): when A is not in the run, later passes seed from the existing sidecar.
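The skip rules can be illustrated with a small merge function. This is a sketch, assuming the sidecar holds `nodes` and `edges` lists of dicts; the actual sidecar layout is not specified here and `merge_into_sidecar` is a hypothetical name.

```python
def merge_into_sidecar(
    sidecar: dict, new_nodes: list[dict], new_edges: list[dict]
) -> tuple[int, int]:
    """Append only unseen nodes and edges; return (nodes_added, edges_added)."""
    seen_ids = {node["id"] for node in sidecar["nodes"]}
    seen_edges = {(e["source"], e["target"], e["relation"]) for e in sidecar["edges"]}

    nodes_added = 0
    for node in new_nodes:
        if node["id"] in seen_ids:
            continue  # already present: skipped by ID
        sidecar["nodes"].append(node)
        seen_ids.add(node["id"])
        nodes_added += 1

    edges_added = 0
    for edge in new_edges:
        key = (edge["source"], edge["target"], edge["relation"])
        if key in seen_edges:
            continue  # already present: skipped by (source, target, relation)
        sidecar["edges"].append(edge)
        seen_edges.add(key)
        edges_added += 1

    return nodes_added, edges_added
```

Calling it twice with the same input adds everything the first time and nothing the second, which is the behaviour a safe re-run relies on.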
Development
git clone <repo>
cd graphify-intent
pip install -e ".[dev]"
python -m pytest tests/ -v
Tests cover every module — IDs, section splitting with spans, relation/confidence validation, all four passes (including Pass D's candidate resolution and embedding fallback), merge/enriched-graph assembly, the report, backend resolution, and an end-to-end smoke test. The LLM boundary is injected/mocked, so the suite needs neither graphify nor an API key.
Note on graphify peer dependency
graphify-intent does not declare graphifyy as a hard dependency because the package name and extras vary by installation method. Install it separately:
pip install graphifyy # core — enough for the default subscription backend
pip install "graphifyy[anthropic]" # add for --backend api with Anthropic
pip install "graphifyy[gemini]" # add for --backend api with Gemini
For the default subscription backend you also need the Claude Code CLI (claude) installed and signed in. graphify-intent exits with a clear error message if graphify is not found at runtime.
Architecture decisions
Design rationale is recorded as ADRs in docs/adr/:
| ADR | Decision |
|---|---|
| 0001 | Direct, schema-enforced LLM calls |
| 0002 | Intent/rationale as the product wedge + node data model |
| 0003 | Span-level grounding & provenance |
| 0004 | Embedding-free, same-file anchoring (v1) |
| 0005 | Extend graphify's relation vocabulary deliberately |
| 0006 | Enriched-graph assembly on the node-link format |
| 0007 | Determinism, idempotency, and a single confidence policy |
| 0008 | Subscription-first LLM backend via the Claude CLI |
| 0009 | Cross-document concept resolution (Pass D) |
License and attribution
graphify-intent is licensed under the MIT License — Copyright (c) 2026 Will Neill.
This project is an independent post-processor built to interoperate with
graphify by Safi Shamsi. It reuses graphify's
graph schema and relation vocabulary and calls graphify as a separately-installed runtime
dependency; no graphify source code is bundled with or distributed as part of this project.
graphify is licensed under the MIT License (Copyright (c) 2026 Safi Shamsi) — see the
ACKNOWLEDGEMENT AND ATTRIBUTION section of this repository's LICENSE file and
the upstream license for the
full text. With thanks to the graphify project.