graphmy

Parse any codebase, build a knowledge graph, visualise it, and query it in natural language.

License: MIT · Python 3.10+

graphmy turns any source code directory into an interactive, queryable knowledge graph. Point it at a codebase and get:

  • A navigable graph — every function, class, method, and file as a node; every call, import, and inheritance as a typed edge
  • A self-contained HTML visualisation — open in any browser, share as a single file, no server required
  • Natural language queries — "what calls authenticate?", "find functions related to payment processing"
  • Multi-language — Python, JavaScript, TypeScript, Go, Rust, and Java in one tool

Install

pip install graphmy

For the --serve live UI:

pip install "graphmy[serve]"

For LLM-synthesized query answers (requires an OpenAI API key):

pip install "graphmy[openai]"

Quick start

# 1. Index a codebase (builds graph + vector index in .graphmy/)
graphmy index ./my-project

# 2. Query in natural language
graphmy query ./my-project "what calls authenticate?"
graphmy query ./my-project "find functions related to payments"

# 3. Generate interactive HTML visualisation (self-contained, ~2MB+)
graphmy viz ./my-project
open output.html   # macOS; on Linux use xdg-open

# 4. Or boot a live server with NL query bar
graphmy viz ./my-project --serve

# 5. Inspect codebase stats
graphmy info ./my-project

How it works

┌─────────────────────────────────────────────────────┐
│  PARSER  (tree-sitter, per-language grammars)        │
│  • functions, classes, methods, imports, inheritance │
│  • works on Python, JS/TS, Go, Rust, Java           │
├─────────────────────────────────────────────────────┤
│  GRAPH   (networkx DiGraph)                          │
│  • nodes: File, Class, Function, Method, …           │
│  • edges: CALLS, IMPORTS, DEFINES, CONTAINS,         │
│           INHERITS, IMPLEMENTS                       │
│  • persisted as JSON in .graphmy/                    │
├─────────────────────────────────────────────────────┤
│  SEARCH  (sentence-transformers + chromadb)          │
│  • embeds every symbol (name + signature + docstring)│
│  • incremental upsert — only re-embeds changed files │
│  • structural queries bypass embeddings entirely     │
├─────────────────────────────────────────────────────┤
│  VISUALISER  (cytoscape.js + dagre layout)           │
│  • self-contained HTML (no server needed)            │
│  • click any node → name, file:line, source preview, │
│    callers, callees                                  │
│  • --serve boots FastAPI + NL query bar              │
└─────────────────────────────────────────────────────┘
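The parser-to-graph step in the diagram above can be illustrated with a small sketch. It is not graphmy's implementation: graphmy uses tree-sitter grammars, while this example substitutes Python's standard `ast` module so it runs without extra dependencies. The function name `extract_edges`, the tuple layout, and the sample source are all hypothetical; only the edge type names (DEFINES, CALLS, INHERITS, IMPORTS) come from the diagram.

```python
import ast

def extract_edges(source: str, filename: str = "example.py"):
    """Return (nodes, edges) for one Python source string.

    Stand-in for a tree-sitter pass: handles only top-level
    functions, classes, and plain `import x` statements.
    """
    tree = ast.parse(source, filename=filename)
    nodes = [(filename, "File")]
    edges = []
    for item in tree.body:
        if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
            nodes.append((item.name, "Function"))
            edges.append((filename, "DEFINES", item.name))
            # Record calls made by simple name, e.g. foo(), not obj.foo()
            for sub in ast.walk(item):
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    edges.append((item.name, "CALLS", sub.func.id))
        elif isinstance(item, ast.ClassDef):
            nodes.append((item.name, "Class"))
            edges.append((filename, "DEFINES", item.name))
            for base in item.bases:
                if isinstance(base, ast.Name):
                    edges.append((item.name, "INHERITS", base.id))
        elif isinstance(item, ast.Import):
            for alias in item.names:
                edges.append((filename, "IMPORTS", alias.name))
    return nodes, edges

SAMPLE = '''
import hashlib

class Base:
    pass

class User(Base):
    pass

def check_password(pw):
    return len(pw) > 8

def authenticate(user, pw):
    return check_password(pw)
'''

nodes, edges = extract_edges(SAMPLE)
# "What calls check_password?" becomes a filter over typed edges.
callers = [src for src, kind, dst in edges
           if kind == "CALLS" and dst == "check_password"]
print(callers)
```

Keeping the edge type as data, rather than using separate graphs per relationship, is what lets a single traversal answer questions about calls, imports, or inheritance.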

CLI reference

graphmy index   <path> [--exclude GLOB] [--fresh]
graphmy query   <path> <query> [--limit N] [--explain]
graphmy viz     <path> [--out FILE] [--serve] [--host H] [--port P]
                       [--max-body-lines N]
graphmy info    <path>
graphmy config  <path>
graphmy --version
graphmy --help

graphmy index

Parse the codebase and build the graph + vector index. Stores everything in .graphmy/ at the project root. Subsequent runs are incremental — only changed files are re-parsed.

graphmy index ./my-project
graphmy index ./my-project --exclude "tests/**" --exclude "**/*.min.js"
graphmy index ./my-project --fresh   # ignore cache, full re-index
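
One plausible way to decide which files count as changed is sketched below. This is an assumption about the mechanism, not graphmy's actual code: the README only states that a per-file modification time and SHA-256 are cached. The helper names `sha256_of` and `changed_files` are hypothetical.

```python
import hashlib
import os
import tempfile
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hex SHA-256 of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(paths, cache):
    """Return paths whose content differs from the cached record.

    cache maps str(path) -> [mtime, sha256] and is updated in place.
    """
    changed = []
    for path in paths:
        key = str(path)
        mtime = path.stat().st_mtime
        cached = cache.get(key)
        if cached is not None and cached[0] == mtime:
            continue  # same mtime: assume unchanged, skip hashing
        digest = sha256_of(path)
        if cached is None or cached[1] != digest:
            changed.append(path)
        cache[key] = [mtime, digest]
    return changed

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "app.py"
    src.write_text("def f(): pass\n")
    cache = {}
    first = changed_files([src], cache)    # new file: reported
    second = changed_files([src], cache)   # untouched: not reported
    src.write_text("def f(): return 1\n")
    st = src.stat()
    # Force a visible mtime change even on coarse-grained filesystems.
    os.utime(src, (st.st_atime, st.st_mtime + 10))
    third = changed_files([src], cache)    # edited: reported again
```

Checking the modification time first avoids hashing every file on every run; the hash then filters out files that were touched but not actually edited.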

graphmy query

Search the codebase in natural language.

graphmy query ./my-project "what calls validate_user?"
graphmy query ./my-project "find authentication functions" --limit 10
graphmy query ./my-project "explain the payment flow" --explain  # requires openai extra
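
The split between structural and semantic queries could be pictured roughly as follows. This is a hypothetical sketch: the README does not document how graphmy recognises a structural query, so the regular expression and the `route_query` function are illustrative assumptions only.

```python
import re

# Assumed pattern: "what calls <identifier>?" is treated as structural.
CALLERS_PATTERN = re.compile(
    r"^\s*what\s+calls\s+([A-Za-z_][A-Za-z0-9_.]*)\s*\??\s*$",
    re.IGNORECASE,
)

def route_query(query: str):
    """Return ("structural", symbol) or ("semantic", text)."""
    match = CALLERS_PATTERN.match(query)
    if match:
        # Answerable by graph traversal alone; no embeddings needed.
        return ("structural", match.group(1))
    # Everything else falls back to vector search.
    return ("semantic", query.strip())

print(route_query("what calls validate_user?"))
print(route_query("find authentication functions"))
```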

graphmy viz

Generate an interactive visualisation.

# Self-contained HTML file (default)
graphmy viz ./my-project
graphmy viz ./my-project --out my-graph.html
graphmy viz ./my-project --max-body-lines 50   # cap inlined source for large repos

# Live server with NL query bar
graphmy viz ./my-project --serve
graphmy viz ./my-project --serve --host 0.0.0.0 --port 8080
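
The effect of `--max-body-lines` can be sketched as a simple truncation, with 0 meaning unlimited as in the config file setting. The function `cap_body` and the wording of the truncation marker are hypothetical; graphmy's real output may differ.

```python
def cap_body(source: str, max_body_lines: int = 0) -> str:
    """Keep at most max_body_lines lines of source; 0 means no limit."""
    lines = source.splitlines()
    if max_body_lines <= 0 or len(lines) <= max_body_lines:
        return source
    omitted = len(lines) - max_body_lines
    # Marker text is an illustrative choice, not graphmy's format.
    kept = lines[:max_body_lines] + [f"... ({omitted} more lines)"]
    return "\n".join(kept)

print(cap_body("a\nb\nc\nd", 2))
```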

Configuration

Create .graphmy/config.toml in your project root, or use environment variables:

# .graphmy/config.toml

# OpenAI integration (enables --explain in queries and the Explain button in --serve UI)
openai_api_key = "sk-..."
openai_model   = "gpt-4o-mini"   # default

# Paths to exclude from indexing (glob patterns, relative to project root)
exclude = ["tests/**", "docs/**", "**/*.min.js", "**/node_modules/**"]

# Maximum source lines inlined per symbol in static HTML (0 = unlimited)
max_body_lines = 0

Environment variables take precedence over the config file:

export GRAPHMY_OPENAI_API_KEY="sk-..."
export GRAPHMY_OPENAI_MODEL="gpt-4o"

Python API

graphmy can also be used programmatically:

from graphmy import GraphmyIndex, GraphmyConfig

# Index a codebase
config = GraphmyConfig(exclude=["tests/**"])
index = GraphmyIndex("./my-project", config=config)
index.build()   # incremental by default

# Structural query — exact graph traversal
results = index.query_structural("authenticate")
for r in results:
    print(r.name, r.file, r.line)

# Natural language query
results = index.query_nl("what handles user authentication?")
for r in results:
    print(r.symbol.name, r.score, r.relationships)

# Export visualisation
index.export_html("graph.html")

Supported languages

Language     Extensions       Extracts
Python       .py              functions, classes, methods, imports, inheritance, decorators, async
JavaScript   .js .mjs .cjs    functions, classes, methods, ESM imports, require()
TypeScript   .ts .tsx         all JS + interfaces, type aliases, enums, implements
Go           .go              functions, methods on types, structs, interfaces, imports
Rust         .rs              functions, structs, enums, traits, impl blocks, use statements
Java         .java            classes, interfaces, methods, constructors, extends, implements

External symbols (calls/imports to dependencies outside the project root) appear as stub nodes — visible in the graph but not expanded. This keeps your graph focused on your code.
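
As a rough illustration of the stub-node idea, the sketch below marks any edge target that was never defined inside the project as a stub. The `add_stub_nodes` function, the `stub` attribute, and the sample symbols are hypothetical; graphmy's own node schema is not documented here.

```python
def add_stub_nodes(defined, edges):
    """Build a node table; undefined edge targets become stubs."""
    nodes = {name: {"stub": False} for name in defined}
    for _src, _kind, dst in edges:
        if dst not in nodes:
            # Target lives outside the project: keep it visible,
            # but do not expand it further.
            nodes[dst] = {"stub": True}
    return nodes

defined = {"authenticate", "check_password"}
edges = [
    ("authenticate", "CALLS", "check_password"),
    ("check_password", "CALLS", "bcrypt.checkpw"),
]
nodes = add_stub_nodes(defined, edges)
stubs = sorted(name for name, attrs in nodes.items() if attrs["stub"])
print(stubs)
```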


Index cache (.graphmy/)

<project-root>/
└── .graphmy/
    ├── config.toml          # optional user config
    ├── graph.json           # full graph (networkx node-link JSON)
    ├── file_hashes.json     # {filepath: [mtime, sha256]} for incremental re-index
    └── vectors/             # chromadb embedded vector store (SQLite + HNSW)

Add .graphmy/ to your .gitignore (graphmy does this automatically on first index).
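
Because graph.json is plain node-link JSON, it can in principle be inspected without graphmy. The sketch below assumes the usual networkx layout (a `nodes` list plus an edge list under `links`, or `edges` depending on the networkx version) and assumes each edge carries a `type` attribute; the attribute name and the sample data are guesses, so check them against a real file before relying on this.

```python
import json

def callers_of(graph_data: dict, target: str):
    """List sources of CALLS edges that point at target."""
    # The edge list key varies across networkx versions.
    edge_list = graph_data.get("links") or graph_data.get("edges") or []
    return sorted(
        e["source"] for e in edge_list
        if e["target"] == target and e.get("type") == "CALLS"
    )

# Hand-written sample in node-link form, not real graphmy output.
SAMPLE_JSON = """
{
  "directed": true,
  "nodes": [{"id": "login"}, {"id": "signup"}, {"id": "authenticate"}],
  "links": [
    {"source": "login", "target": "authenticate", "type": "CALLS"},
    {"source": "signup", "target": "authenticate", "type": "CALLS"},
    {"source": "login", "target": "signup", "type": "IMPORTS"}
  ]
}
"""

graph_data = json.loads(SAMPLE_JSON)
print(callers_of(graph_data, "authenticate"))
```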


Contributing

See CONTRIBUTING.md. Issues and PRs are welcome.


License

MIT — see LICENSE.
