Polyglot code-analysis framework — parse Python, TypeScript, Go and Rust into a shared graph IR with type-aware resolution

graphlens

Extensible polyglot code analysis framework that parses source projects, normalizes their structure into a shared graph IR, and exposes it for dependency analysis, navigation, and code intelligence tooling.


Architecture

Repository → Language Adapter → GraphLens (IR) → Graph Backend

| Layer | Responsibility |
| --- | --- |
| Language Adapter | Parses source files, produces GraphLens |
| GraphLens | Typed nodes + directed relations (the IR) |
| Graph Backend | Persists or queries the graph (Neo4j, in-memory, …) |

Adapters are pure data producers — they never write to any backend. The graph is the only output.

Why graph IR?

  • Language-agnostic — one shared model for Python, TypeScript, Rust, …
  • Plugin-based adapters — each language is a separate package, registered via Python entry points
  • Tree-sitter powered — all adapters use tree-sitter for CST parsing and exact span positions, combined with type-aware resolution (ty for Python, TypeScript Compiler API for TypeScript, gopls for Go, rust-analyzer for Rust)
  • Cross-language aware — adapters emit language-agnostic BOUNDARY ports (HTTP, queues, gRPC, Temporal); graphlens-link connects a consumer in one language to a provider in another
  • Monorepo aware: can_handle() and find_*_roots() handle multi-language repos correctly
  • Deterministic node IDs — SHA-256 hash of project::kind::qualified_name → stable across re-scans
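
The deterministic-ID point can be illustrated with a minimal sketch. It assumes the three fields are joined with `::`, encoded as UTF-8, and that the full hex digest is kept; the helper name `make_node_id` is hypothetical, and graphlens may encode or truncate the key differently. Because the ID depends only on the key, two scans can be compared with plain set operations.

```python
import hashlib


def make_node_id(project: str, kind: str, qualified_name: str) -> str:
    """Hypothetical helper: hash 'project::kind::qualified_name' with SHA-256."""
    key = f"{project}::{kind}::{qualified_name}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# Two scans of the same code yield the same ID, so scans can be compared by ID.
scan_before = {make_node_id("shop", "FUNCTION", "shop.cart.add_item")}
scan_after = {
    make_node_id("shop", "FUNCTION", "shop.cart.add_item"),
    make_node_id("shop", "FUNCTION", "shop.cart.remove_item"),
}

added = scan_after - scan_before
removed = scan_before - scan_after
print(len(added), len(removed))  # prints "1 0"
```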

Benchmarks

Analysis throughput on large real-world projects, refreshed automatically on every release — one cold run per project inside the published Docker image (so the numbers reflect exactly the toolchain users get). See benchmarks/ to reproduce locally or add a project.

Last run: 2026-06-21 12:45 UTC · image latest · runner Linux x86_64 · single cold run, indicative only.

| Project | Lang | Commit | LOC | Files | Nodes | Relations | Time | Peak RSS | KLOC/s | Resolver | Resolved |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| apache/superset | python | c83fb2b | 399 519 | 1 886 | 156 268 | 379 813 | 136.5s | 2,051 MB | 2.9 | ok | 84% of 281 667 (68s) |
| colinhacks/zod | typescript | 1fb56a5 | 74 194 | 404 | 8 741 | 25 258 | 18.3s | 646 MB | 4.1 | ok | 91% of 15 771 (14s) |
| gin-gonic/gin | go | 73726dc | 23 672 | 98 | 7 227 | 11 882 | 13.2s | 2,102 MB | 1.8 | ok | 100% of 8 920 (12s) |
| casdoor/casdoor | go | 696bcf0 | 86 898 | 458 | 14 987 | 28 276 | 129.3s | 14,576 MB | 0.7 | ok | 100% of 19 421 (126s) |
| gohugoio/hugo | go | 4d22555 | 224 821 | 897 | 34 809 | 72 225 | 106.8s | 9,468 MB | 2.1 | ok | 99% of 49 013 (100s) |
| BurntSushi/ripgrep | rust | 4649aa9 | 50 275 | 98 | 7 691 | 10 001 | 150.9s | 2,354 MB | 0.3 | ok | 54% of 11 435 (146s) |
| tokio-rs/axum | rust | c59208c | 43 653 | 296 | 8 422 | 9 673 | 984.7s | 8,888 MB | 0.0 | ok | 35% of 9 662 (893s) |
| astral-sh/ruff | rust | 6686f63 | 687 409 | 1 870 | 52 047 | 62 214 | 34.8s | 1,505 MB | 19.8 | ok | 0% of 155 276 (1s) |
| Total | | | 1 590 441 | | 290 192 | | 1574.5s | | 1.0 | | 61% of 551 165 |

Peak RSS is measured via cgroup v2 (whole process tree, including LSP resolver subprocesses). KLOC/s is thousands of lines analysed per second. Generated by benchmarks/run_benchmarks.py.
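
As a worked check of the KLOC/s column, the sketch below recomputes it from the LOC and Time figures in the table. Rounding to one decimal place is an assumption chosen to match the published values; the function name is hypothetical and is not taken from benchmarks/run_benchmarks.py.

```python
def kloc_per_second(loc: int, seconds: float) -> float:
    """Thousands of lines analysed per second, rounded to one decimal place."""
    return round(loc / 1000 / seconds, 1)


# LOC and Time values copied from the benchmark table.
print(kloc_per_second(399_519, 136.5))     # apache/superset -> 2.9
print(kloc_per_second(74_194, 18.3))       # colinhacks/zod  -> 4.1
print(kloc_per_second(1_590_441, 1574.5))  # Total row       -> 1.0
```

Note that the Total figure is total LOC divided by total time, not the mean of the per-project rates.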

Documentation

Full product documentation lives at https://Neko1313.github.io/graphlens/ (built with Docusaurus from website/):

  • Getting Started — install, quick start, core concepts
  • Guides — library API, CLI, querying, visualization, Neo4j, cross-language, MCP
  • CI Integration — strict mode, GitHub Actions, Docker, local hooks
  • Adapters — Python, TypeScript, Go, Rust, and writing your own
  • Graph Model — nodes, relations, boundaries, serialization
  • API Reference — exact signatures

To run the docs locally: cd website && pnpm install && pnpm start.

Installation

# Core library only (models, contracts, registry)
pip install graphlens

# Core + Python adapter
pip install "graphlens[python]"

# Core + TypeScript adapter
pip install "graphlens[typescript]"

# Core + Go / Rust adapters
pip install "graphlens[go]"
pip install "graphlens[rust]"

# CLI (graphlens analyze / visualize / query / neo4j)
pip install "graphlens-cli[python]"          # with Python adapter
pip install "graphlens-cli[all]"             # Python + TS + Go + Rust + Neo4j

With uv:

uv add graphlens
uv add "graphlens[python]"
uv add "graphlens[typescript]"
uv add "graphlens-cli[all]"

Docker (all adapters + toolchains pre-installed)

For CI, the published image bundles the CLI with every adapter and the toolchains their resolvers drive (ty, Node, Go + gopls, Rust + rust-analyzer), so no local setup is required. It is also the supported way to get the Go and Rust adapters, which are not published to PyPI. Mount your project at /workspace:

docker run --rm -v "$PWD:/workspace" ghcr.io/neko1313/graphlens \
    analyze /workspace --output /workspace/graph.json

The image is published to the GitHub Container Registry on each release (:latest plus :X.Y.Z / :X.Y version tags).

Quick start

from pathlib import Path
from graphlens import adapter_registry

# Load and instantiate the Python adapter
adapter = adapter_registry.load("python")()

# Analyze a project — returns a GraphLens
graph = adapter.analyze(Path("./my-project"))

print(f"Nodes:     {len(graph.nodes)}")
print(f"Relations: {len(graph.relations)}")

# Inspect nodes by kind
from graphlens import NodeKind

modules = [n for n in graph.nodes.values() if n.kind == NodeKind.MODULE]
classes = [n for n in graph.nodes.values() if n.kind == NodeKind.CLASS]

# Check the resolver actually ran (don't trust a silently degraded graph)
from graphlens import RESOLVER_STATUS_KEY
assert graph.metadata[RESOLVER_STATUS_KEY] == "ok"

# Query the graph (indexed lookups, no manual scanning)
fn = next(n for n in graph.nodes.values() if n.name == "my_function")
callers = graph.callers(fn.id)          # who calls it
callees = graph.callees(fn.id)          # what it calls
near = graph.neighbors(fn.id, depth=2)  # 2-hop neighbourhood

# Serialize for pipelines / agents (round-trippable JSON), then reload
text = graph.to_json(indent=2)
graph2 = type(graph).from_json(text)

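# Diff two scans (e.g. before/after a change); old_graph is assumed to be
# a graph from an earlier scan, for example one reloaded with from_json
diff = old_graph.diff(graph)
print(diff.added_nodes, diff.removed_relations, diff.is_empty)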
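
To make "indexed lookups" concrete, here is a self-contained sketch that does not use graphlens at all. The `CallIndex` class is a hypothetical stand-in: it builds caller and callee indexes once from a list of call edges, then answers `callers`, `callees`, and a depth-limited `neighbors` query without rescanning the edges. Treating the neighbourhood as bidirectional is an assumption; the real `graph.neighbors` may behave differently.

```python
from collections import defaultdict, deque


class CallIndex:
    """Hypothetical stand-in for an indexed graph; not the graphlens API."""

    def __init__(self, calls: list[tuple[str, str]]) -> None:
        self._out: dict[str, set[str]] = defaultdict(set)  # caller -> callees
        self._in: dict[str, set[str]] = defaultdict(set)   # callee -> callers
        for caller, callee in calls:
            self._out[caller].add(callee)
            self._in[callee].add(caller)

    def callers(self, node: str) -> set[str]:
        return set(self._in.get(node, ()))

    def callees(self, node: str) -> set[str]:
        return set(self._out.get(node, ()))

    def neighbors(self, node: str, depth: int = 1) -> set[str]:
        """Breadth-first search over edges in both directions, up to `depth` hops."""
        seen = {node}
        queue = deque([(node, 0)])
        while queue:
            current, dist = queue.popleft()
            if dist == depth:
                continue  # do not expand beyond the requested depth
            for nxt in self.callers(current) | self.callees(current):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
        return seen - {node}


index = CallIndex([("main", "load"), ("load", "parse"), ("parse", "tokenize")])
print(sorted(index.callers("parse")))             # ['load']
print(sorted(index.neighbors("parse", depth=2)))  # ['load', 'main', 'tokenize']
```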

CLI (graphlens-cli)

Install graphlens-cli to get the graphlens entry point:

# Print node/relation statistics
graphlens analyze <project_root>
graphlens analyze ~/myrepo --lang python,typescript,go,rust

# Serialize the graph to JSON (CI indexing step); --strict fails on a
# degraded resolver so a pipeline never feeds agents an incomplete graph
graphlens analyze ~/myrepo --output graph.json
graphlens analyze ~/myrepo --format json
graphlens analyze ~/myrepo --strict

# Query a saved graph (callers | callees | references | neighbors)
graphlens query my_function --graph graph.json --op callers
graphlens query MyClass.method --graph graph.json --op neighbors --depth 2

# Interactive HTML graph viewer (opens in browser)
graphlens visualize <project_root>
graphlens visualize ~/myrepo --lang python --show-external --max-nodes 500
graphlens visualize . --output graph.html --no-open

# Export to Neo4j
graphlens neo4j <project_root> --uri bolt://localhost:7687 --user neo4j --password secret
graphlens neo4j . --wipe --batch-size 200

# Serve the graph to agents over the Model Context Protocol (needs the
# optional `mcp` extra: pip install "graphlens-cli[mcp]")
graphlens mcp --graph graph.json

mcp — Model Context Protocol server

Exposes a saved graph to LLM agents as MCP tools: graph_stats, find_nodes, callers, callees, references, neighbors, boundaries, and communicates_with. Install with the mcp extra and point it at a JSON graph produced by graphlens analyze --output.

visualize — interactive HTML graph viewer

Produces a self-contained HTML file powered by vis.js and opens it in the browser.

Flag Description
--lang auto|python|typescript|python,typescript Adapters to use (default: auto-detect all)
--show-external Include stdlib / third-party external symbol nodes
--show-structure Add CONTAINS / DECLARES structural edges
--max-nodes N Prune low-degree nodes above N (default: 1500)
--output PATH Write HTML to PATH instead of graph-<name>.html
--no-open Do not open the browser automatically
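
The --max-nodes flag is described as pruning low-degree nodes once the graph exceeds N. One plausible reading is sketched below, under the assumption that degree counts incoming plus outgoing edges and that ties are broken by name; `prune_low_degree` is a hypothetical function, and the actual viewer may rank nodes differently.

```python
from collections import Counter


def prune_low_degree(
    nodes: set[str], edges: list[tuple[str, str]], max_nodes: int
) -> tuple[set[str], list[tuple[str, str]]]:
    """Keep the `max_nodes` highest-degree nodes and the edges between them."""
    if len(nodes) <= max_nodes:
        return set(nodes), list(edges)
    degree = Counter({n: 0 for n in nodes})
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    # Sort by degree (descending), then by name so the result is deterministic.
    ranked = sorted(nodes, key=lambda n: (-degree[n], n))
    kept = set(ranked[:max_nodes])
    kept_edges = [(s, d) for s, d in edges if s in kept and d in kept]
    return kept, kept_edges


nodes = {"a", "b", "c", "d"}
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
kept, kept_edges = prune_low_degree(nodes, edges, max_nodes=3)
print(sorted(kept))  # ['a', 'b', 'c']
```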

Click behaviour — click any node to see its info panel. For FUNCTION and METHOD nodes the panel has a "Show callers" button that switches the graph into focus mode: only the selected node and every node that calls or references it are shown, with the caller list in the sidebar. Click empty space or ← Back to return to the full graph.

neo4j — export to Neo4j

Uses UNWIND … MERGE Cypher (no APOC required). Every node gets a :Code label plus a kind-specific label (:Function, :ExternalSymbol, …). Relations are created grouped by type. Install the optional neo4j extra:

pip install "graphlens-cli[neo4j]"
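
The UNWIND … MERGE pattern can be sketched without a database connection. The example below only builds the Cypher text and splits rows into batches; the property names (`id`, `props`), the `rows` parameter name, and both helper functions are assumptions for illustration, not the exporter's actual code. Each batch would be sent as the `rows` parameter through a Neo4j driver, which is left out here.

```python
import re
from collections.abc import Iterator


def merge_nodes_query(kind_label: str) -> str:
    """Build an UNWIND ... MERGE statement for one kind-specific label."""
    # Labels cannot be passed as Cypher parameters, so validate before interpolating.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", kind_label):
        raise ValueError(f"unsafe label: {kind_label!r}")
    return (
        "UNWIND $rows AS row "
        "MERGE (n:Code {id: row.id}) "
        f"SET n:{kind_label}, n += row.props"
    )


def batches(rows: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start : start + size]


rows = [{"id": f"node-{i}", "props": {"name": f"fn_{i}"}} for i in range(450)]
query = merge_nodes_query("Function")
print(query)
print([len(chunk) for chunk in batches(rows, 200)])  # [200, 200, 50]
```

MERGE on the `id` property keeps re-exports idempotent: running the same batch twice updates the existing node instead of creating a duplicate.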

Graph model

Node kinds

Kind Description
PROJECT Root project node
MODULE Python/TS/… module (directory or file)
FILE Source file
CLASS Class declaration
FUNCTION Top-level function
METHOD Method inside a class
PARAMETER Function/method parameter
VARIABLE Module-level or local variable
ATTRIBUTE Class attribute
TYPE_ALIAS Type alias declaration
IMPORT Import statement
DEPENDENCY Declared package dependency
EXTERNAL_SYMBOL External symbol (stdlib, third-party, or unknown); carries metadata["origin"]
BOUNDARY Cross-language interface port (HTTP route, queue topic, gRPC method, Temporal activity); shared id collapses matching server/client across languages

Relation kinds

Kind Description
CONTAINS Structural containment (project → module → file → class)
DECLARES Declaration (file declares function, class declares method)
IMPORTS Import edge (file → import node)
RESOLVES_TO Import resolved to a module or external symbol
CALLS Function/method call (resolved to declaration node)
REFERENCES Value reference (variable/attribute used as a value)
INHERITS_FROM Class inheritance (resolved to declaration node)
HAS_TYPE Type annotation/inference edge (function/param/variable → class or external)
DEPENDS_ON Package dependency
EXPOSES A server/provider exposes a BOUNDARY (e.g. an HTTP route handler)
CONSUMES A client/consumer consumes a BOUNDARY (e.g. an HTTP call)
COMMUNICATES_WITH Consumer → provider, added by graphlens-link from matching EXPOSES/CONSUMES
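
A minimal sketch of how typed nodes and directed relations fit together is shown below. The classes are simplified, hypothetical stand-ins that reuse a few kind names from the tables above; field names such as `source` and `target` are assumptions and may not match the real graphlens models.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    PROJECT = "PROJECT"
    MODULE = "MODULE"
    FILE = "FILE"
    CLASS = "CLASS"
    METHOD = "METHOD"


class RelationKind(Enum):
    CONTAINS = "CONTAINS"
    DECLARES = "DECLARES"


@dataclass(frozen=True)
class Node:
    id: str
    kind: NodeKind
    name: str


@dataclass(frozen=True)
class Relation:
    kind: RelationKind
    source: str  # id of the node the edge starts from
    target: str  # id of the node the edge points to


nodes = {
    n.id: n
    for n in [
        Node("p", NodeKind.PROJECT, "shop"),
        Node("m", NodeKind.MODULE, "shop.cart"),
        Node("f", NodeKind.FILE, "cart.py"),
        Node("c", NodeKind.CLASS, "Cart"),
        Node("x", NodeKind.METHOD, "add_item"),
    ]
}
# Containment chain project -> module -> file -> class, then class declares method.
relations = [
    Relation(RelationKind.CONTAINS, "p", "m"),
    Relation(RelationKind.CONTAINS, "m", "f"),
    Relation(RelationKind.CONTAINS, "f", "c"),
    Relation(RelationKind.DECLARES, "c", "x"),
]

# Every relation must connect two nodes that exist in the graph.
dangling = [r for r in relations if r.source not in nodes or r.target not in nodes]
print(len(nodes), len(relations), len(dangling))  # prints "5 4 0"
```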

Cross-language boundaries

Adapters emit BOUNDARY ports for the interfaces a service exposes or consumes — HTTP/REST routes and clients, message-queue topics, gRPC methods, and Temporal activities. Each port has a language-agnostic id (make_boundary_id(mechanism, key)), so a Python FastAPI route and a TypeScript fetch call to the same path collapse onto one BOUNDARY node when their graphs are merged. The graphlens-link package then pairs CONSUMES with EXPOSES into COMMUNICATES_WITH edges:

from graphlens_link import link_graph

merged = python_graph.merge(ts_graph, allow_shared=True)
result = link_graph(merged)          # adds COMMUNICATES_WITH edges

See examples/demo_cross_language.py for a Python-server ↔ TypeScript-client walkthrough.
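
The pairing step can be sketched independently of graphlens-link. The code below assumes a boundary id derived only from the mechanism and a normalized key, and represents edges as plain records; `boundary_id`, `Edge`, `link`, and the key format `GET /users/{id}` are all hypothetical, so treat this as an illustration of the idea rather than the package's implementation.

```python
import hashlib
from dataclasses import dataclass


def boundary_id(mechanism: str, key: str) -> str:
    """Hypothetical language-agnostic id: depends only on mechanism and key."""
    return hashlib.sha256(f"{mechanism}::{key}".encode("utf-8")).hexdigest()[:16]


@dataclass(frozen=True)
class Edge:
    kind: str    # "EXPOSES", "CONSUMES" or "COMMUNICATES_WITH"
    source: str  # id of the function or method
    target: str  # boundary id, or the provider id for COMMUNICATES_WITH


def link(edges: list[Edge]) -> list[Edge]:
    """Pair every consumer of a boundary with every provider of the same boundary."""
    providers: dict[str, list[str]] = {}
    for e in edges:
        if e.kind == "EXPOSES":
            providers.setdefault(e.target, []).append(e.source)
    return [
        Edge("COMMUNICATES_WITH", e.source, provider)
        for e in edges
        if e.kind == "CONSUMES"
        for provider in providers.get(e.target, [])
    ]


# Both languages compute the same id for the same route, so the ports collapse.
port = boundary_id("http", "GET /users/{id}")
merged = [
    Edge("EXPOSES", "py:get_user", port),    # Python route handler
    Edge("CONSUMES", "ts:fetchUser", port),  # TypeScript client call
]
links = link(merged)
print(links)
```

A consumer whose boundary has no provider in the merged graph simply produces no edge, which makes unmatched clients easy to report separately.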

Adapter plugin system

Language adapters register themselves via Python entry points — no changes to the core needed:

# packages/graphlens-python/pyproject.toml
[project.entry-points."graphlens.adapters"]
python = "graphlens_python:PythonAdapter"

The registry discovers installed adapters automatically at runtime:

from graphlens import adapter_registry

adapter_registry.available()          # ["python", ...]
adapter_cls = adapter_registry.load("python")
adapter = adapter_cls()

Adapters can also be registered manually (useful for testing):

adapter_registry.register("python", MyPythonAdapter)

Implementing an adapter

Subclass LanguageAdapter and implement four methods:

from pathlib import Path
from graphlens import GraphLens, LanguageAdapter

class MyLangAdapter(LanguageAdapter):
    def language(self) -> str:
        return "mylang"

    def file_extensions(self) -> set[str]:
        return {".ml", ".mli"}

    def can_handle(self, project_root: Path) -> bool:
        return (project_root / "dune-project").exists()

    def analyze(
        self, project_root: Path, files: list[Path] | None = None
    ) -> GraphLens:
        graph = GraphLens()
        if files is None:  # keep an explicitly passed empty list as-is
            files = self.collect_files(project_root)
        # ... parse and populate graph ...
        return graph

Register in pyproject.toml and the core registry finds it automatically.
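
The can_handle() check above looks only at the directory it is given. In a monorepo, an adapter usually also needs to locate every sub-project root. The sketch below shows one way to do that by searching for a marker file; `find_roots` and the marker names are hypothetical, and the adapters' own find_*_roots() helpers may use different rules.

```python
import tempfile
from pathlib import Path


def find_roots(repo: Path, marker: str) -> list[Path]:
    """Return every directory under `repo` that contains the file `marker`."""
    return sorted(p.parent for p in repo.rglob(marker) if p.is_file())


# Build a throwaway monorepo layout to demonstrate the search.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    for rel in (
        "services/api/pyproject.toml",
        "tools/cli/pyproject.toml",
        "web/package.json",
    ):
        target = repo / rel
        target.parent.mkdir(parents=True)
        target.write_text("")
    python_roots = [
        p.relative_to(repo).as_posix() for p in find_roots(repo, "pyproject.toml")
    ]
    ts_roots = [
        p.relative_to(repo).as_posix() for p in find_roots(repo, "package.json")
    ]

print(python_roots)  # ['services/api', 'tools/cli']
print(ts_roots)      # ['web']
```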

Project structure

graphlens/                      ← uv workspace root (core library)
  src/graphlens/                ← models, contracts, registry, exceptions, utils
  packages/
    graphlens-python/           ← Python adapter (tree-sitter + ty)
    graphlens-typescript/       ← TypeScript adapter (tree-sitter + Compiler API)
    graphlens-go/               ← Go adapter (tree-sitter + gopls)
    graphlens-rust/             ← Rust adapter (tree-sitter + rust-analyzer)
    graphlens-link/             ← cross-language linker (COMMUNICATES_WITH)
    graphlens-cli/              ← CLI (typer): analyze, query, visualize, neo4j, mcp
  tests/                         ← core tests (100% coverage)
  examples/                      ← standalone usage examples

Development

Requires Python 3.13+, uv, task.

task install        # uv sync --all-groups
task lint           # ruff + ty + bandit for all packages
task tests          # all tests with coverage

Individual package tasks:

task core:lint           task core:test
task python:lint         task python:test
task typescript:lint     task typescript:test
task cli:lint            task cli:test

License

MIT
