Claude Code skill - turn any folder of code, docs, papers, images, or tweets into a queryable knowledge graph

graphify

A Claude Code skill. Type /graphify in Claude Code - it reads your files, builds a knowledge graph, and gives you back structure you didn't know was there.

Andrej Karpathy keeps a /raw folder where he drops papers, tweets, screenshots, and notes. graphify is built for that kind of folder: on a mixed corpus it needs 71.5x fewer tokens per query than reading the raw files, the graph persists across sessions, and it is honest about what it found vs guessed.

/graphify ./raw
graphify-out/
├── graph.html       interactive graph - click nodes, search, filter by community
├── obsidian/        open as Obsidian vault
├── GRAPH_REPORT.md  god nodes, surprising connections, suggested questions
├── graph.json       persistent graph - query weeks later without re-reading
└── cache/           SHA256 cache - re-runs only process changed files
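
The cache entry above implies content-hash change detection. Here is a minimal sketch of that idea, assuming a JSON manifest that maps relative paths to digests. The function names and the manifest layout are hypothetical and are not graphify's actual cache format.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(root: Path, manifest_path: Path) -> list[Path]:
    """Return files under root whose digest differs from the stored manifest,
    then rewrite the manifest with the current digests.

    Hypothetical helper: keep manifest_path outside root so the manifest
    does not hash itself.
    """
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new = {}
    changed = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = str(path.relative_to(root))
        new[key] = file_digest(path)
        if old.get(key) != new[key]:  # new file or modified content
            changed.append(path)
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    manifest_path.write_text(json.dumps(new, indent=2))
    return changed
```

On a first run every file counts as changed. A second run over an untouched folder returns an empty list, which is the situation where a re-run could skip extraction.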

Install

Requires: Claude Code and Python 3.10+

pip install graphifyy && graphify install

The PyPI package is temporarily named graphifyy while the graphify name is being reclaimed. The CLI and skill command are still graphify.

Then open Claude Code in any directory and type:

/graphify .

Manual install (curl)

mkdir -p ~/.claude/skills/graphify
curl -fsSL https://raw.githubusercontent.com/safishamsi/graphify/v1/skills/graphify/skill.md \
  > ~/.claude/skills/graphify/SKILL.md

Add to ~/.claude/CLAUDE.md:

- **graphify** (`~/.claude/skills/graphify/SKILL.md`) - any input to knowledge graph. Trigger: `/graphify`
When the user types `/graphify`, invoke the Skill tool with `skill: "graphify"` before doing anything else.

Usage

/graphify                          # run on current directory
/graphify ./raw                    # run on a specific folder
/graphify ./raw --mode deep        # more aggressive INFERRED edge extraction
/graphify ./raw --update           # re-extract only changed files, merge into existing graph

/graphify add https://arxiv.org/abs/1706.03762        # fetch a paper, save, update graph
/graphify add https://x.com/karpathy/status/...       # fetch a tweet

/graphify query "what connects attention to the optimizer?"
/graphify path "DigestAuth" "Response"
/graphify explain "SwinTransformer"

/graphify ./raw --svg              # export graph.svg
/graphify ./raw --graphml          # export graph.graphml (Gephi, yEd)
/graphify ./raw --neo4j            # generate cypher.txt for Neo4j
/graphify ./raw --mcp              # start MCP stdio server
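
To give a feel for what a path query such as `/graphify path "DigestAuth" "Response"` asks of a graph, here is a small breadth-first search over an undirected edge list. This is a sketch of the general technique, not graphify's implementation. The edge list is hypothetical and does not come from a real run.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Breadth-first search over an undirected edge list.

    Returns the list of nodes from source to target, or None if no path exists.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    if source not in adjacency or target not in adjacency:
        return None

    parent = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk parents back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical edges for illustration only.
edges = [
    ("DigestAuth", "Auth"),
    ("Auth", "Client"),
    ("Client", "Response"),
    ("Auth", "Request"),
]
print(shortest_path(edges, "DigestAuth", "Response"))
# ['DigestAuth', 'Auth', 'Client', 'Response']
```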

Works with any mix of file types:

| Type | Extensions | Extraction |
| --- | --- | --- |
| Code | .py .ts .js .go .rs .java .c .cpp .rb .cs .kt .scala .php | AST via tree-sitter + call-graph pass |
| Docs | .md .txt .rst | Concepts + relationships via Claude |
| Papers | .pdf | Citation mining + concept extraction |
| Images | .png .jpg .webp .gif | Claude vision - screenshots, diagrams, any language |
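
As a rough illustration of what an AST-based call-graph pass produces, the sketch below extracts caller-to-callee pairs from Python source. It uses Python's built-in `ast` module rather than tree-sitter, so it is a stand-in for the idea and not graphify's extractor. The function name `call_edges` is hypothetical.

```python
import ast


def call_edges(source: str) -> list[tuple[str, str]]:
    """Return sorted (caller, callee) pairs for direct name calls
    made inside top-level functions of the given source."""
    tree = ast.parse(source)
    edges = set()
    for func in tree.body:
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for node in ast.walk(func):
            # Only plain calls like foo(); method calls like obj.foo() are skipped.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                edges.add((func.name, node.func.id))
    return sorted(edges)


sample = """
def load(path):
    return open(path).read()

def main():
    text = load("notes.md")
    print(len(text))
"""
print(call_edges(sample))
# [('load', 'open'), ('main', 'len'), ('main', 'load'), ('main', 'print')]
```

Each pair maps naturally onto a directed edge in a graph, which is why a call-graph pass pairs well with the AST step.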

What you get

God nodes - highest-degree concepts (what everything connects through)
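
"Highest-degree" here means the nodes with the most incident edges. A minimal sketch of that ranking over a plain edge list follows. The edges and the `god_nodes` helper are hypothetical, for illustration only.

```python
from collections import Counter


def god_nodes(edges, top_n=3):
    """Rank nodes by degree (number of incident edges), highest first."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name so ties are deterministic.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Hypothetical edges for illustration only.
edges = [
    ("attention", "transformer"),
    ("transformer", "optimizer"),
    ("transformer", "layernorm"),
    ("optimizer", "adam"),
]
print(god_nodes(edges, top_n=2))
# [('transformer', 3), ('optimizer', 2)]
```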

Surprising connections - ranked by composite score. Code-paper edges rank higher than code-code. Each result includes a plain-English why.

Suggested questions - 4-5 questions the graph is uniquely positioned to answer

Token benchmark - printed automatically after every run. On a mixed corpus (Karpathy repos + papers + images): 71.5x fewer tokens per query vs reading raw files.
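
A figure like "71.5x fewer tokens" can be read as the tokens needed to answer a query from the raw files divided by the tokens needed to answer it from the graph. The arithmetic is sketched below. The token counts are made-up round numbers chosen to reproduce the ratio, not the measured counts from the benchmark, and `reduction_factor` is a hypothetical helper.

```python
def reduction_factor(raw_tokens: int, graph_tokens: int) -> float:
    """Return how many times fewer tokens the graph query uses."""
    if graph_tokens <= 0:
        raise ValueError("graph_tokens must be positive")
    return raw_tokens / graph_tokens


# Illustrative numbers only: 143,000 raw tokens vs 2,000 graph tokens.
print(reduction_factor(143_000, 2_000))
# 71.5
```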

Every edge is tagged EXTRACTED, INFERRED, or AMBIGUOUS - you always know what was found vs guessed.
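
Because every edge carries one of three tags, downstream code can separate what was found from what was guessed. The sketch below groups edges by tag. It assumes each edge is a dict with `source`, `target`, and `tag` keys, which is a hypothetical layout and not the documented graph.json schema.

```python
from collections import defaultdict

# The three tags named in this README.
TAGS = ("EXTRACTED", "INFERRED", "AMBIGUOUS")


def group_by_tag(edges):
    """Group edge dicts by their tag; unknown tags raise ValueError."""
    groups = defaultdict(list)
    for edge in edges:
        if edge["tag"] not in TAGS:
            raise ValueError(f"unknown tag: {edge['tag']!r}")
        groups[edge["tag"]].append((edge["source"], edge["target"]))
    return dict(groups)


# Hypothetical edges for illustration only.
edges = [
    {"source": "train.py", "target": "AdamW", "tag": "EXTRACTED"},
    {"source": "AdamW", "target": "weight decay paper", "tag": "INFERRED"},
    {"source": "train.py", "target": "model.py", "tag": "EXTRACTED"},
]
grouped = group_by_tag(edges)
print(grouped["EXTRACTED"])
# [('train.py', 'AdamW'), ('train.py', 'model.py')]
```

Filtering to EXTRACTED edges first is one way to check a surprising connection before trusting an inferred one.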

Worked examples

| Corpus | Type | Reduction | Eval |
| --- | --- | --- | --- |
| Karpathy repos + 5 papers + 4 images | Mixed | 71.5x | worked/karpathy-repos/review.md |
| httpx (Python HTTP client) | Code | small corpus¹ | worked/httpx/review.md |
| Code + paper + Arabic image | Multi-type | small corpus¹ | worked/mixed-corpus/review.md |

¹ Small corpora fit in one context window - graph value is structural clarity, not compression.

Tech stack

NetworkX + Leiden (graspologic) + tree-sitter + Claude + vis.js. No Neo4j required, no server, runs entirely locally.

Contributing

Worked examples are the most trust-building contribution. Run /graphify on a real corpus, save output to worked/{slug}/, write an honest review.md evaluating what the graph got right and wrong, submit a PR.

Extraction bugs - open an issue with the input file, the cache entry (graphify-out/cache/), and what was missed or invented.

See ARCHITECTURE.md for module responsibilities and how to add a language.
