Intent-layer enrichment for graphify knowledge graphs: extract the decisions, mechanisms, constraints, and trade-offs behind your docs as grounded, anchored graph nodes.

graphify-intent

Post-process a graphify knowledge graph with an intent layer — the why behind your docs. Where graphify extracts what exists, graphify-intent extracts the decisions, mechanisms, constraints, and trade-offs your prose encodes, each with a grounded rationale, and anchors them to the concepts graphify already found. A pipeline of LLM passes — extract → anchor → cross-doc relate, plus an opt-in cross-document concept-resolution pass — produces a sidecar JSON and an enriched graph.json.

flowchart LR
    D["docs/*.md"] --> A
    G["graph.json"] --> B
    A["Pass A<br/>extract intent"] --> B["Pass B<br/>anchor to concepts"]
    B --> C["Pass C<br/>cross-doc intent"]
    B --> R["Pass D<br/>concept resolution<br/>(opt-in)"]
    A --> OUT
    C --> OUT
    R --> OUT
    OUT["outputs:<br/>.graphify_intent.json<br/>graph.enriched.json<br/>enrichment_report.md"]

Requirements

graphify-intent post-processes graphify graphs, so you need graphifyy installed in the same Python environment. By default it uses your Claude Pro/Max subscription via the Claude Code CLI — no API key required. To use a billed API key instead, see LLM backend.

Installation

Install into the same environment as graphify:

pip install graphify-intent
# or with uv:
uv pip install graphify-intent

To enable Pass D's embedding-based concept resolution, add the optional extra:

pip install "graphify-intent[embeddings]"   # local model2vec embedder for Pass D (quoted so shells like zsh do not glob the brackets)

Without it, Pass D still runs — it degrades gracefully to a lexical-similarity fallback.

Quick start

# Uses your Claude Pro/Max subscription by default (needs the `claude` CLI).
graphify-intent \
  --graph graphify-out/graph.json \
  --docs docs/ \
  --passes A,B,C

LLM backend

By default, graphify-intent uses your Claude Pro/Max subscription through the Claude Code CLI (claude) — no API key, no per-token cost. It falls back to a billed API key if the CLI isn't available.

| --backend | Behaviour |
| --- | --- |
| (unset) | Subscription if the claude CLI is on your PATH, else a detected API key |
| subscription | Force the Claude Code CLI (subscription auth) |
| api | Force a billed API key (ANTHROPIC_API_KEY / GEMINI_API_KEY) |
| claude / gemini / … | Force a specific graphify backend |

You can also set GRAPHIFY_INTENT_BACKEND. For subscription mode, install the claude CLI and run it once to sign in. For API mode, install the matching extra (graphifyy[anthropic] or graphifyy[gemini]) and provide the key.
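
The resolution order described above can be sketched in Python. This is an illustrative sketch, not graphify-intent's actual code: `resolve_backend` is a hypothetical helper, and letting the `--backend` value take precedence over `GRAPHIFY_INTENT_BACKEND` is an assumption.

```python
import os
import shutil


def resolve_backend(requested=None, env=None, which=shutil.which):
    """Hypothetical helper mirroring the --backend rules; not the real implementation."""
    env = os.environ if env is None else env
    # Assumption: an explicit flag value wins over the environment variable.
    choice = requested or env.get("GRAPHIFY_INTENT_BACKEND")
    if choice:
        return choice  # "subscription", "api", or a provider name such as "claude"
    if which("claude"):
        return "subscription"  # the claude CLI was found on PATH
    if env.get("ANTHROPIC_API_KEY") or env.get("GEMINI_API_KEY"):
        return "api"  # fall back to a detected API key
    raise RuntimeError("no claude CLI on PATH and no API key detected")
```

The `env` and `which` parameters exist only so the lookup can be exercised without touching the real environment.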

Keep API keys out of .env and your shell history

If you do use an API key, prefer a password-manager CLI over inlining the secret. With 1Password's op, inject it at runtime so it never lands on disk in plaintext:

export ANTHROPIC_API_KEY="$(op read 'op://<vault>/<item>/credential')"
graphify-intent --backend api --graph graphify-out/graph.json --docs docs/

…or wrap the command with op run so the secret lives only for that process:

op run --env-file=.env.op -- graphify-intent --backend api --graph graphify-out/graph.json --docs docs/

A real key committed in .env risks leaking into git history, CI logs, and backups; a manager keeps it encrypted, access-audited, and revocable. Best of all, the default subscription backend needs no key at all.

CLI flags

| Flag | Default | Description |
| --- | --- | --- |
| --graph | graphify-out/graph.json | Path to the graphify output graph.json |
| --docs | docs/ | Directory containing markdown source files |
| --passes | A,B,C | Comma-separated passes: A extract, B anchor, C cross-doc relate, D cross-doc concept resolution (opt-in) |
| --backend | (auto) | Backend selection (see LLM backend) |
| --min-confidence | 0.75 | Minimum confidence_score for Pass C intent edges and Pass D same_concept edges |
| --max-concurrency | 4 | Maximum number of concurrent LLM requests |
| --max-cost-usd | 2.00 | Cost guard for API backends: prompts if the estimate exceeds this (skipped for the free subscription backend) |
| --yes | false | Skip the cost-guard prompt (non-interactive / CI) |

Set --backend to subscription, api, or an explicit provider name (see LLM backend).
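
The interaction between --max-cost-usd, --yes, and the backend can be read as a small decision rule. The sketch below is a hedged illustration only: `needs_confirmation` is a hypothetical name, and how the tool actually computes its cost estimate is not covered here.

```python
def needs_confirmation(backend, estimated_usd, max_cost_usd=2.00, yes=False):
    """Hypothetical helper: True when the user should be prompted before running."""
    if backend == "subscription":
        return False  # the guard is skipped for the subscription backend
    if yes:
        return False  # --yes skips the prompt (non-interactive / CI)
    return estimated_usd > max_cost_usd  # prompt only when the estimate exceeds the cap
```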

Outputs

All outputs are written to the same directory as graph.json:

| File | Contents |
| --- | --- |
| .graphify_intent.json | Sidecar: all intent nodes and edges added across runs |
| graph.enriched.json | Original graph merged with the intent layer (graphify node-link format) |
| enrichment_report.md | Before/after metrics: node/link/isolated counts, intent breakdown by kind, anchored vs orphaned, grounding coverage |
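
As a rough illustration of the node/link/isolated counts in the report, the sketch below computes them from a node-link dictionary. It assumes the common node-link key names (`nodes`, `links`, `id`, `source`, `target`); whether graph.enriched.json uses exactly these keys is an assumption, and `graph_metrics` is a hypothetical helper.

```python
def graph_metrics(graph):
    """Count nodes, links, and isolated nodes in a node-link style dict (assumed keys)."""
    nodes = graph.get("nodes", [])
    links = graph.get("links", [])
    connected = set()
    for link in links:
        connected.add(link["source"])
        connected.add(link["target"])
    # A node is isolated when no link names it as source or target.
    isolated = sum(1 for node in nodes if node["id"] not in connected)
    return {"nodes": len(nodes), "links": len(links), "isolated": isolated}
```

Loading graph.json and graph.enriched.json with `json.load` and comparing the two results would give a before/after view similar in spirit to the report.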

How it works

Pass A — intent extraction. Each markdown document is split into heading-bounded sections. Per section, the LLM extracts 0-N intent units — a decision, mechanism, constraint, or tradeoff — each with a one-sentence claim, the rationale (the why), any alternatives considered, and a deterministic ID grounded to the section's character span.
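
A minimal sketch of heading-bounded splitting with character spans, plus a deterministic ID derived from the span. Both functions are hypothetical: the heading pattern, the fields hashed into the ID, and the `intent_` prefix are assumptions made for illustration, not the project's actual scheme.

```python
import hashlib
import re

# ATX headings only; a "# comment" line inside a code block would also match.
HEADING = re.compile(r"^#{1,6} .*$", re.MULTILINE)


def split_sections(text):
    """Return (start, end, section_text) for each heading-bounded section."""
    starts = [m.start() for m in HEADING.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any preamble before the first heading
    bounds = starts + [len(text)]
    return [(s, e, text[s:e]) for s, e in zip(bounds, bounds[1:]) if text[s:e].strip()]


def intent_id(source_file, start, end, kind, claim):
    """Stable ID: the same file, span, kind, and normalised claim always hash alike."""
    key = f"{source_file}:{start}-{end}:{kind}:{claim.strip().lower()}"
    return "intent_" + hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
```

Because each span is a pair of character offsets into the original file, `text[start:end]` recovers the exact passage an intent unit was grounded in.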

Pass B — anchoring. Each intent node is linked to the graph concept(s) it explains via a rationale_for edge. Candidate graph nodes are found without embeddings — same source_file plus lexical overlap — then the LLM adjudicates which the intent actually explains.
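
The embedding-free candidate step can be sketched as a same-file filter followed by a token-overlap score. This is an assumption-laden illustration: `candidate_concepts` is hypothetical, and apart from `source_file` the field names (`claim`, `label`, `id`) and the `min_overlap` parameter are invented for the example. The LLM adjudication that follows is not shown.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def candidate_concepts(intent, graph_nodes, min_overlap=1):
    """IDs of same-file graph nodes sharing tokens with the intent claim, best first."""
    claim_tokens = tokens(intent["claim"])
    scored = []
    for node in graph_nodes:
        if node.get("source_file") != intent["source_file"]:
            continue  # only nodes from the same source file qualify
        overlap = len(claim_tokens & tokens(node.get("label", "")))
        if overlap >= min_overlap:
            scored.append((overlap, node["id"]))
    # Highest overlap first; ties broken by ID so the order is deterministic.
    return [node_id for _, node_id in sorted(scored, key=lambda p: (-p[0], p[1]))]
```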

Pass C — cross-doc intent. A single LLM call over all intent nodes surfaces intent-level relationships between them — supersedes, trade_off_against, constrains, motivated_by, contradicts. Edges below --min-confidence are discarded.
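
A hedged sketch of the post-filter applied to Pass C output: keep only edges whose relation is in the vocabulary listed above and whose confidence_score meets the threshold. `keep_edges` is a hypothetical helper, and the dictionary shape of an edge is an assumption.

```python
ALLOWED_RELATIONS = {
    "supersedes",
    "trade_off_against",
    "constrains",
    "motivated_by",
    "contradicts",
}


def keep_edges(edges, min_confidence=0.75):
    """Drop edges with an unknown relation or a confidence_score below the threshold."""
    return [
        edge
        for edge in edges
        if edge.get("relation") in ALLOWED_RELATIONS
        and edge.get("confidence_score", 0.0) >= min_confidence
    ]
```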

Pass D — cross-doc concept resolution (opt-in). Enable with --passes A,B,C,D. Pass D unifies the graphify concepts the intent layer touches non-destructively, emitting same_concept edges between existing concept nodes (nothing is merged or rewritten). Cross-document intent linkage then emerges transitively — intent —rationale_for→ JWT —same_concept— token ←rationale_for— intent. Candidate pairs are cross-file only, blocked by embedding cosine when a local model is available (optional [embeddings] extra) or lexical Jaccard otherwise, then the LLM adjudicates each pair; edges below --min-confidence are dropped. Every edge records its method (embedding/lexical) and similarity for provenance.
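
The lexical fallback for candidate blocking can be illustrated with Jaccard similarity over label tokens. This is a sketch under stated assumptions, not the project's code: `candidate_pairs`, the `label` and `id` fields, and the 0.5 blocking threshold are all hypothetical, and the LLM adjudication step is omitted.

```python
import re
from itertools import combinations


def tokens(text):
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection over size of the union; 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(nodes, threshold=0.5):
    """Cross-file concept pairs whose label similarity meets the threshold."""
    pairs = []
    for left, right in combinations(nodes, 2):
        if left["source_file"] == right["source_file"]:
            continue  # candidate pairs are cross-file only
        similarity = jaccard(tokens(left["label"]), tokens(right["label"]))
        if similarity >= threshold:
            pairs.append({
                "source": left["id"],
                "target": right["id"],
                "method": "lexical",  # recorded for provenance
                "similarity": similarity,
            })
    return pairs
```

An embedding-based variant would swap `jaccard` for cosine similarity between vectors and record `"embedding"` as the method.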

Idempotency

Re-running graphify-intent is safe. Intent nodes already present in the sidecar (.graphify_intent.json) are skipped by ID, and edges are skipped by (source, target, relation) key, so a re-run adds only genuinely new intent. You can also run passes separately (e.g. --passes A then --passes B,C): when A is not in the run, later passes seed from the existing sidecar.
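
The skip-by-key behaviour can be sketched as a merge that tracks node IDs and (source, target, relation) triples. `merge_sidecar` is a hypothetical helper, and the sidecar layout with top-level `nodes` and `edges` lists is an assumption made for the example.

```python
def merge_sidecar(sidecar, new_nodes, new_edges):
    """Append only unseen nodes and edges; return (nodes_added, edges_added)."""
    seen_ids = {node["id"] for node in sidecar["nodes"]}
    seen_edges = {(e["source"], e["target"], e["relation"]) for e in sidecar["edges"]}
    nodes_added = edges_added = 0
    for node in new_nodes:
        if node["id"] not in seen_ids:  # nodes are skipped by ID
            sidecar["nodes"].append(node)
            seen_ids.add(node["id"])
            nodes_added += 1
    for edge in new_edges:
        key = (edge["source"], edge["target"], edge["relation"])
        if key not in seen_edges:  # edges are skipped by (source, target, relation)
            sidecar["edges"].append(edge)
            seen_edges.add(key)
            edges_added += 1
    return nodes_added, edges_added
```

Calling it twice with the same inputs adds nothing the second time, which is the property that makes a re-run safe.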

Development

git clone <repo>
cd graphify-intent
pip install -e ".[dev]"
python -m pytest tests/ -v

Tests cover every module — IDs, section splitting with spans, relation/confidence validation, all four passes (including Pass D's candidate resolution and embedding fallback), merge/enriched-graph assembly, the report, backend resolution, and an end-to-end smoke test. The LLM boundary is injected/mocked, so the suite needs neither graphify nor an API key.

Note on graphify peer dependency

graphify-intent does not declare graphifyy as a hard dependency because the package name and extras vary by installation method. Install it separately:

pip install graphifyy            # core — enough for the default subscription backend
pip install "graphifyy[anthropic]" # add for --backend api with Anthropic
pip install "graphifyy[gemini]"    # add for --backend api with Gemini

For the default subscription backend you also need the Claude Code CLI (claude) installed and signed in. If graphify is not found at runtime, graphify-intent exits with a clear error message.

Architecture decisions

Design rationale is recorded as ADRs in docs/adr/:

| ADR | Decision |
| --- | --- |
| 0001 | Direct, schema-enforced LLM calls |
| 0002 | Intent/rationale as the product wedge + node data model |
| 0003 | Span-level grounding & provenance |
| 0004 | Embedding-free, same-file anchoring (v1) |
| 0005 | Extend graphify's relation vocabulary deliberately |
| 0006 | Enriched-graph assembly on the node-link format |
| 0007 | Determinism, idempotency, and a single confidence policy |
| 0008 | Subscription-first LLM backend via the Claude CLI |
| 0009 | Cross-document concept resolution (Pass D) |

License and attribution

graphify-intent is licensed under the MIT License — Copyright (c) 2026 Will Neill.

This project is an independent post-processor built to interoperate with graphify by Safi Shamsi. It reuses graphify's graph schema and relation vocabulary and calls graphify as a separately-installed runtime dependency; no graphify source code is bundled with or distributed as part of this project. graphify is licensed under the MIT License (Copyright (c) 2026 Safi Shamsi) — see the ACKNOWLEDGEMENT AND ATTRIBUTION section of this repository's LICENSE file and the upstream license for the full text. With thanks to the graphify project.
