Knowledge graph plugin for Terraform, Kubernetes, GitHub Actions, and Docker Compose infrastructure files
infra-graph
Stop asking your AI to read 70 files. Give it a graph.
infra-graph is a knowledge graph engine for infrastructure files. It parses your Terraform, Kubernetes, ArgoCD, GitHub Actions, Docker Compose, Helm, and Kustomize files, builds a structural dependency graph, and exposes it as an MCP server — so your AI assistant reads compact graph context instead of raw files on every question.
Why infra-graph?
Every time you ask your AI assistant an infrastructure question, it reads your entire repo from scratch.
- "What does this EC2 instance depend on?" → AI reads all 80 .tf files
- "Which ConfigMap does this Deployment use?" → AI reads every manifest
- "What breaks if I change this ArgoCD AppProject?" → AI scans everything again
The cross-file relationships that matter — a Security Group referenced by 12 resources, a ConfigMap mounted by 5 Deployments, an ArgoCD ApplicationSet deploying 9 services to 3 clusters — are invisible without a graph.
infra-graph pre-indexes those relationships once. Every subsequent question reads the compact graph.
| Approach | Tokens per query |
|---|---|
| AI reads all files (naive) | ~29,600–71,000 |
| get_minimal_context | ~300 |
| get_blast_radius (targeted) | ~500–800 |
| Full graph (worst case) | ~1,100 |
Up to ~65× token reduction on targeted queries (see Benchmarks).
Quick Start
Prerequisites: Python 3.10+ · pip · An AI assistant with MCP support (Claude Code, Cursor, Codex, or OpenCode)
Step 1 — Install
pip install infra-graph7
The PyPI package is infra-graph7. Once installed, the CLI command is infra-graph (no 7).
Step 2 — Go to your infrastructure repo
cd /path/to/your/infra-repo
Step 3 — Wire it into your AI assistant
infra-graph install
This auto-detects your AI assistant (Claude Code, Cursor, Codex, OpenCode) and writes the MCP config. Done — restart your AI assistant and it will use infra-graph automatically.
Step 4 — Build the graph
infra-graph build .
You'll see a GRAPH_REPORT.md appear in the current directory with a summary of your infrastructure: god nodes, communities, surprising connections, and token savings.
Step 5 — Ask questions
Open your AI assistant and ask:
What is the blast radius if I delete the production VPC?
Which Deployments use the app-config ConfigMap?
Show me the full architecture overview.
What secrets does the external-secrets operator manage?
Which services depend on this database?
The AI now reads compact graph context (~500 tokens) instead of all your files (~30,000 tokens).
Installation Details
Claude Code
infra-graph install --platform claude-code
This writes:
- .mcp.json: MCP server config (Claude Code picks this up automatically on next launch)
- CLAUDE.md: instructs Claude to use infra-graph tools before reading files
Then restart Claude Code. You'll see infra-graph listed under available MCP servers.
You can also use the /infra-graph slash command:
/infra-graph . # build + get orientation summary
/infra-graph . --update # incremental update after file changes
Cursor
infra-graph install --platform cursor
Writes .cursor/rules/infra-graph.mdc. Restart Cursor to pick it up.
Codex
infra-graph install --platform codex
Writes AGENTS.md with tool usage instructions.
OpenCode
infra-graph install --platform opencode
Manual / other assistants
infra-graph serve # starts the MCP stdio server
Point your assistant's MCP config at this command. The server speaks the standard MCP stdio protocol.
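For assistants that read a JSON MCP config, the wiring can be sketched as below. This is an illustration only: the mcpServers layout follows the convention used by project-level .mcp.json files, the helper name add_mcp_server is hypothetical, and the exact file that infra-graph install writes is not shown in this README.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str = "infra-graph") -> dict:
    """Add a stdio server entry that launches `infra-graph serve`.

    Existing entries in the config file are preserved.
    """
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    # Assumed entry shape: the assistant spawns this command and talks over stdio.
    servers[name] = {"command": "infra-graph", "args": ["serve"]}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling add_mcp_server(Path(".mcp.json")) from the repo root would create or update the file in place; check your assistant's documentation for the config path and schema it actually expects.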
Building the graph
infra-graph build . # parse everything in current directory
infra-graph build ./terraform # only Terraform files
infra-graph build ./k8s # only Kubernetes manifests
infra-graph build . --update # re-parse only files that changed (fast)
infra-graph build . --watch # auto-rebuild on every file save
infra-graph build . --mode deep # add optional LLM semantic annotations
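The --update flag re-parses only files that changed, and the architecture notes mention a SHA-256 file cache. A minimal sketch of how hash-based change detection can work is shown here; the function names and the in-memory cache shape are assumptions for illustration, not the project's actual code.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(paths: list[Path], cache: dict[str, str]) -> list[Path]:
    """Return files whose digest differs from the cached one, updating the cache."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        if cache.get(str(path)) != digest:
            cache[str(path)] = digest  # remember the new content hash
            changed.append(path)
    return changed
```

On the first run every file counts as changed; on later runs only edited files are returned, so only those need re-parsing.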
After each build, GRAPH_REPORT.md is written with:
- God nodes — highest-degree resources everything connects through
- Communities — automatically detected resource clusters (networking, compute, secrets, CI/CD)
- Surprising edges — cross-community connections worth reviewing
- Token savings — naive token cost vs. graph query cost for your repo
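To make "god nodes" concrete: a node's degree is the number of edges touching it, and the report surfaces the highest-degree nodes. The sketch below computes that from a plain edge list; top_hubs and the sample edges are hypothetical, and the real tool works on its own graph structure.

```python
from collections import Counter


def top_hubs(edges: list[tuple[str, str]], n: int = 3) -> list[tuple[str, int]]:
    """Return the n nodes with the highest degree (in-degree plus out-degree)."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    return degree.most_common(n)


# Hypothetical edges: three resources all reference one security group.
edges = [
    ("resource.aws_instance.web", "resource.aws_security_group.web"),
    ("resource.aws_lb.public", "resource.aws_security_group.web"),
    ("resource.aws_db_instance.main", "resource.aws_security_group.web"),
]
print(top_hubs(edges, n=1))
```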
Ignoring files
Create .infraignore in your repo root (same syntax as .gitignore):
.terraform/
*.tfstate
*.tfstate.backup
dist/
node_modules/
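As a rough illustration of how the patterns above behave, the sketch below handles just the two pattern shapes used in this example: directory names ending in / and file-name globs. It is a deliberate simplification of .gitignore semantics (no negation, anchoring, or ** support), and is_ignored is a hypothetical helper, not part of infra-graph.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

PATTERNS = [".terraform/", "*.tfstate", "*.tfstate.backup", "dist/", "node_modules/"]


def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    """Return True if a repo-relative file path matches any ignore pattern."""
    parts = PurePosixPath(rel_path).parts
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore files under any directory with that name.
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern):
            # File pattern: match against the file name only.
            return True
    return False
```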
What gets parsed
| Format | Extensions | What gets extracted |
|---|---|---|
| Terraform / HCL | .tf .hcl | Resources, modules, variables, outputs, locals, data sources, providers, ${} interpolations, depends_on |
| Kubernetes | .yaml .yml with apiVersion | Deployments, Services, ConfigMaps, Secrets, Ingresses, StatefulSets, DaemonSets, HPAs, PVCs, ServiceAccounts + label→selector edges |
| ArgoCD | .yaml with argoproj.io | AppProjects, Applications, ApplicationSets, cluster generators, member_of + deploys_to edges |
| cert-manager | .yaml with cert-manager.io | ClusterIssuers, Issuers, Certificates + uses_issuer, creates_secret edges |
| External Secrets | .yaml with external-secrets.io | ExternalSecrets, ClusterSecretStores + uses_store edges |
| GitHub Actions | .yml in .github/workflows/ | Jobs, steps, uses: action refs, needs: deps, secret usage |
| Docker Compose | docker-compose.yml / compose.yaml | Services, volumes, networks, depends_on |
| Helm | Chart.yaml + values*.yaml | Chart metadata, value file override edges |
| Helm templates | templates/*.yaml | Go {{}} directives auto-stripped; static structure extracted cleanly |
| Kustomize | kustomization.yaml | Base/overlay extends and patches edges |
MCP Tools
Once installed, your AI assistant calls these tools automatically. You can also ask it to call them explicitly.
| Tool | What it does |
|---|---|
| get_minimal_context | ~300-token orientation: god nodes, community count, totals. Start here. |
| get_blast_radius | Every resource affected by a change, with depth and edge type |
| query_graph | BFS/DFS traversal from any node in any direction |
| get_resource_context | Full detail on one resource: all edges, community, file, line number |
| get_architecture_overview | Community map with dominant types and coupling warnings |
| detect_changes | Risk-scored impact analysis for a git diff |
| find_hub_nodes | Top N highest-degree (most connected) resources |
| get_knowledge_gaps | Orphaned resources, ambiguous edges, unresolved references |
| build_or_update_graph | Trigger a rebuild or incremental update from within the AI |
| search_resources | Keyword search across node IDs, names, types, and labels |
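Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 tools/call request, one JSON message per line on the server's stdin, after the usual initialize handshake. The sketch below only builds such a request. The argument name node_id is an assumption; this README does not document each tool's input schema, so check the tool listing your assistant shows.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single JSON line."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# `node_id` is a hypothetical argument name used for illustration.
line = tool_call("get_blast_radius", {"node_id": "resource.aws_vpc.main"})
print(line)
```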
Node ID format
Use these IDs when calling tools directly:
| Resource type | Node ID format | Example |
|---|---|---|
| Terraform resource | resource.<type>.<name> | resource.aws_vpc.main |
| Terraform variable | variable.<name> | variable.region |
| Terraform module | module.<name> | module.vpc |
| Kubernetes workload | <Kind>/<namespace>/<name> | Deployment/default/api |
| ArgoCD AppProject | AppProject/<namespace>/<name> | AppProject/argocd/my-project |
| ArgoCD Application | Application/<namespace>/<name> | Application/argocd/frontend |
| Compose service | service/<project>/<name> | service/myapp/postgres |
| GitHub Actions job | job/<workflow>/<job_key> | job/ci/build |
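If you build these IDs in scripts, small helpers keep the two separator conventions straight: dots for Terraform, slashes for everything else. The helper names below are hypothetical and simply mirror the formats in the table.

```python
def terraform_resource_id(resource_type: str, name: str) -> str:
    """Build a Terraform resource node ID, e.g. resource.aws_vpc.main."""
    return f"resource.{resource_type}.{name}"


def k8s_node_id(kind: str, namespace: str, name: str) -> str:
    """Build a Kubernetes or ArgoCD node ID, e.g. Deployment/default/api."""
    return f"{kind}/{namespace}/{name}"


def split_node_id(node_id: str) -> tuple[str, ...]:
    """Split a node ID into its parts, using '/' if present and '.' otherwise."""
    separator = "/" if "/" in node_id else "."
    return tuple(node_id.split(separator))
```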
CLI Reference
# Build the graph
infra-graph build . # full build
infra-graph build . --update # incremental (only changed files)
infra-graph build . --mode deep # with optional LLM annotation
infra-graph build . --watch # auto-rebuild on file saves
# Query from the terminal
infra-graph query "what does aws_instance.web depend on?"
infra-graph blast-radius resource.aws_vpc.main
infra-graph path "Deployment/default/api" "ConfigMap/default/app-config"
# Inspect
infra-graph status # node / edge / community counts
infra-graph visualize # open interactive vis.js graph in browser
# Server
infra-graph serve # start MCP stdio server manually
# Install
infra-graph install # auto-detect AI assistant
infra-graph install --platform claude-code
infra-graph install --platform cursor
infra-graph install --platform codex
infra-graph install --platform opencode
Benchmarks
| Corpus | Files | Naive tokens/query | Graph tokens/query | Reduction |
|---|---|---|---|---|
| AWS three-tier Terraform | 38 .tf | ~31,000 | ~520 | ~60× |
| Kubernetes GitOps repo | 120 manifests | ~48,000 | ~980 | ~49× |
| Mixed monorepo (TF + k8s + Actions) | 160 | ~71,000 | ~1,100 | ~65× |
| ArgoCD GitOps repo | 70 YAML | ~29,600 | ~650 | ~46× |
| Small single-service Compose | 4 files | ~1,200 | ~950 | ~1.3× |
Small repo note: For repos under ~20 files, graph overhead can exceed raw file size. infra-graph pays off at scale — when questions span multiple files and change frequently.
Reproduce with infra-graph eval --all.
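The Reduction column is simply naive tokens divided by graph tokens. The sketch below recomputes it from the approximate figures in the table; it does not run the benchmark itself.

```python
# (naive tokens per query, graph tokens per query), copied from the table above.
BENCHMARKS = {
    "AWS three-tier Terraform": (31_000, 520),
    "Kubernetes GitOps repo": (48_000, 980),
    "Mixed monorepo (TF + k8s + Actions)": (71_000, 1_100),
    "ArgoCD GitOps repo": (29_600, 650),
    "Small single-service Compose": (1_200, 950),
}


def reduction(naive_tokens: int, graph_tokens: int) -> float:
    """Return how many times fewer tokens the graph query uses."""
    return naive_tokens / graph_tokens


for corpus, (naive, graph) in BENCHMARKS.items():
    print(f"{corpus}: {reduction(naive, graph):.1f}x")
```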
How it works
Pass 1 — Structural parse (no LLM)
Terraform files are parsed with python-hcl2. YAML files are parsed with ruamel.yaml. Helm templates have Go {{}} directives stripped before parsing. Every resource, module, variable, workload, ArgoCD app, cert, and workflow job becomes a typed node. Every interpolation, dependency, and selector reference becomes a typed edge.
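To illustrate how an interpolation becomes an edge, the sketch below pulls references out of ${...} expressions in an attribute string and maps them to the node ID formats described earlier. The regexes, the skipped prefixes, and the referenced_nodes name are assumptions for illustration; the real extractor in tf_parser.py likely covers more cases.

```python
import re

INTERPOLATION = re.compile(r"\$\{([^}]*)\}")
# A dotted reference chain such as aws_vpc.main.id or module.vpc.vpc_id.
DOTTED = re.compile(r"[A-Za-z_][\w-]*(?:\.[A-Za-z_][\w-]*)+")
# Prefixes skipped in this sketch because they need different node ID handling.
SKIPPED_PREFIXES = {"local", "data", "each", "count", "self", "path", "terraform"}


def referenced_nodes(value: str) -> list[str]:
    """Return node IDs referenced inside ${...} interpolations of a string."""
    nodes = []
    for expression in INTERPOLATION.findall(value):
        for chain in DOTTED.findall(expression):
            head, name, *_ = chain.split(".")
            if head == "var":
                nodes.append(f"variable.{name}")
            elif head == "module":
                nodes.append(f"module.{name}")
            elif head not in SKIPPED_PREFIXES:
                nodes.append(f"resource.{head}.{name}")
    return nodes
```

Each returned ID would become the target of a typed edge from the resource that contains the attribute.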
Pass 2 — Schema-aware inference (no LLM)
Kubernetes label-selector matching runs as a cross-file sweep: a label inverted index is built, then Service selectors are matched against Deployment labels to create routes_to edges. ArgoCD cluster generator matchLabels are matched against cluster Secrets. Helm and Kustomize overlay relationships are detected as extends/patches edges.
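The label inverted index can be sketched as a map from each (key, value) label pair to the workloads carrying it; a Service then routes to the workloads present in every set named by its selector. This is a simplified illustration with hypothetical function names: it takes pre-extracted label dicts keyed by node ID and only matches within the same namespace.

```python
from collections import defaultdict


def build_label_index(workloads: dict[str, dict[str, str]]) -> dict:
    """Map each (key, value) label pair to the set of workload node IDs that carry it."""
    index = defaultdict(set)
    for node_id, labels in workloads.items():
        for pair in labels.items():
            index[pair].add(node_id)
    return index


def routes_to_edges(services: dict[str, dict[str, str]],
                    workloads: dict[str, dict[str, str]]) -> list[tuple[str, str, str]]:
    """Create (service, "routes_to", workload) edges where all selector labels match."""
    index = build_label_index(workloads)
    edges = []
    for service_id, selector in services.items():
        if not selector:
            continue  # a Service without a selector targets no workloads
        namespace = service_id.split("/")[1]
        candidates = [index.get(pair, set()) for pair in selector.items()]
        for target in sorted(set.intersection(*candidates)):
            if target.split("/")[1] == namespace:
                edges.append((service_id, "routes_to", target))
    return edges
```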
Pass 3 — Optional LLM annotation (--mode deep)
Claude annotates communities with human-readable names, extracts design rationale from comments, and enriches report summaries. Not required for token savings — the structural graph alone delivers the reduction numbers above.
Architecture
infra_graph/
├── parsers/
│ ├── tf_parser.py # python-hcl2 + ${} interpolation extractor
│ ├── yaml_parser.py # ruamel.yaml + Helm template pre-processor
│ ├── k8s_schema.py # K8s + ArgoCD + cert-manager + ESO schemas
│ ├── actions_schema.py # GitHub Actions job/step/uses graph
│ ├── compose_schema.py # Docker Compose service graph
│ └── helm_schema.py # Helm Chart.yaml + Kustomize overlay detection
├── graph/
│ ├── builder.py # NetworkX DiGraph + SHA-256 file cache
│ ├── blast_radius.py # BFS impact traversal
│ ├── community.py # Leiden clustering (graspologic) + fallback
│ └── report.py # GRAPH_REPORT.md generator
├── mcp/
│ ├── server.py # MCP stdio server
│ └── tools.py # 10 MCP tool implementations
├── install/
│ ├── claude.py # .mcp.json + CLAUDE.md writer
│ ├── cursor.py # .cursor/rules/infra-graph.mdc
│ └── codex.py # AGENTS.md writer
├── viz/
│ └── html_report.py # vis.js interactive HTML graph
└── cli.py # click CLI
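The blast_radius.py entry above describes a BFS impact traversal. A minimal sketch of that idea follows, assuming a plain dict that maps each node to the nodes depending on it, rather than the NetworkX DiGraph the project uses. The function name and input shape are illustrative only.

```python
from collections import deque


def blast_radius(dependents: dict[str, list[str]], start: str) -> dict[str, int]:
    """Return every node reachable from `start`, with its BFS depth."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for affected in dependents.get(node, []):
            if affected not in depth:  # first visit gives the shortest depth
                depth[affected] = depth[node] + 1
                queue.append(affected)
    del depth[start]  # the changed node itself is not part of its blast radius
    return depth


# Hypothetical graph: subnet depends on vpc, instance and lb depend on subnet.
dependents = {
    "resource.aws_vpc.main": ["resource.aws_subnet.a"],
    "resource.aws_subnet.a": ["resource.aws_instance.web", "resource.aws_lb.public"],
}
print(blast_radius(dependents, "resource.aws_vpc.main"))
```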
Privacy: All parsing happens locally. No file contents leave your machine except during the optional --mode deep LLM pass, which uses your own API key. No telemetry. No cloud.
Contributing
git clone https://github.com/vparab7/infra-graph
cd infra-graph
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
See CONTRIBUTING.md for guidelines on adding new parsers and schemas.
License
Apache 2.0 — see LICENSE.
Built by Vedang Parab