Local-first code graph builder with hybrid vector + graph search

hedwig-cg

"With hedwig-cg, your coding agent knows what to read."


Why hedwig-cg?

hedwig-cg builds a unified code graph from your code, docs, and dependencies, and is built to handle enterprise codebases with 10,000+ files. Its 5-signal hybrid search (code vector + text vector + graph + keyword + community → RRF fusion) lets coding agents search by meaning and structure across the whole project, not just by keyword. Install it, and Claude Code sees the full picture: no extra tokens, no extra commands, and everything runs 100% locally.

Quick Start

pip install hedwig-cg
hedwig-cg claude install

Then tell Claude Code:

"Build a code graph for this project"

That's it. Claude Code will build the graph, and from then on, consult it before every search. The graph auto-rebuilds when your session ends.

AI Agent Integrations

hedwig-cg integrates with major AI coding agents in one command:

| Agent | Install | What it does |
| --- | --- | --- |
| Claude Code | hedwig-cg claude install | Skill + CLAUDE.md + PreToolUse hook |
| Codex CLI | hedwig-cg codex install | AGENTS.md + PreToolUse hook |
| Gemini CLI | hedwig-cg gemini install | GEMINI.md + BeforeTool hook |
| Cursor IDE | hedwig-cg cursor install | .cursor/rules/ rule file |
| Windsurf IDE | hedwig-cg windsurf install | .windsurf/rules/ rule file |
| Cline | hedwig-cg cline install | .clinerules file |
| Aider CLI | hedwig-cg aider install | CONVENTIONS.md + .aider.conf.yml |
| MCP Server | claude mcp add hedwig-cg -- hedwig-cg mcp | 5 tools over Model Context Protocol |

Each install does two things: writes a context file with rules, and (where supported) registers a hook that fires before tool calls. To remove: hedwig-cg <platform> uninstall.

Supported Languages

Deep AST Extraction (17 languages)

hedwig-cg uses tree-sitter tags.scm for universal structural extraction — functions, classes, methods, calls, imports, inheritance — without per-language custom code.

Python JavaScript TypeScript Go
Rust Java C C++
C# Ruby Swift Scala
Lua PHP Elixir Kotlin
Objective-C

Additionally detects and indexes: Markdown, PDF, HTML, CSV, YAML, JSON, TOML, Shell, R, and more.

Multilingual Natural Language

Text nodes (docs, comments, markdown) are embedded with intfloat/multilingual-e5-small supporting 100+ natural languages — Korean, Japanese, Chinese, German, French, and more. Search in your language, find results in any language.


Features

Auto-Rebuild

When integrated with AI coding agents (Claude Code, Codex, etc.), hedwig-cg automatically rebuilds the graph when code changes. The Stop/SessionEnd hook detects modified files via git diff and triggers an incremental rebuild in the background — zero manual intervention.
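
As a rough illustration of the hook's job, the sketch below lists files that differ from HEAD and, if there are any, starts an incremental build without waiting for it. The function names are hypothetical, and the real hook may call git differently; note that a plain git diff does not report untracked files.

```python
import subprocess

def parse_name_only(output: str) -> list[str]:
    """Turn `git diff --name-only` output into a list of paths."""
    return [line.strip() for line in output.splitlines() if line.strip()]

def changed_files(repo_dir: str) -> list[str]:
    """Tracked files that differ from HEAD (staged or unstaged)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_name_only(result.stdout)

def rebuild_if_changed(repo_dir: str) -> bool:
    """Start an incremental build in the background if anything changed."""
    if not changed_files(repo_dir):
        return False
    # Popen returns immediately, so the caller is not blocked by the build.
    # Flag placement after <dir> is an assumption.
    subprocess.Popen(["hedwig-cg", "build", repo_dir, "--incremental"])
    return True
```

Calling rebuild_if_changed(".") from a session-end script would approximate the behavior described above.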

Smart Ignore

hedwig-cg respects ignore patterns from three sources, all using full gitignore spec (negation !, ** globs, directory-only patterns):

| Source | Description |
| --- | --- |
| Built-in | .git, node_modules, __pycache__, dist, build, etc. |
| .gitignore | Auto-read from project root; your existing git ignores just work |
| .hedwig-cg-ignore | Project-specific overrides for the code graph |

Incremental Builds

SHA-256 content hashing per file. Only changed files are re-extracted and re-embedded. Unchanged files are merged from the existing graph, so a rebuild with few or no changes can be 95%+ faster than a full rebuild (see Performance for measured times).
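
Change detection by content hash can be sketched as follows. This is a minimal illustration, not hedwig-cg's actual code: the manifest format (a dict of relative path to digest) and the function names are assumptions.

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def diff_against_manifest(root: Path, manifest: dict[str, str]):
    """Compare files under root with a {relative_path: digest} manifest.

    Returns (changed, removed, new_manifest); 'changed' includes new files.
    """
    current = {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*")) if p.is_file()
    }
    changed = sorted(k for k, v in current.items() if manifest.get(k) != v)
    removed = sorted(k for k in manifest if k not in current)
    return changed, removed, current
```

Only the paths in changed would need re-extraction and re-embedding; everything else can be carried over from the previous graph.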

Memory Management

4GB memory budget with stage-wise release. The pipeline generates → stores → frees at each stage: extraction results are freed after graph build, embeddings are streamed in batches and freed after DB write, and the full graph is released after persistence. GC triggers proactively at 75% threshold.
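
The generate → store → free pattern for embeddings might look like the sketch below. The table name, batch size, and fake_embed are hypothetical stand-ins (the real pipeline uses sentence-transformers and FAISS), and the 75% threshold check is left out for brevity.

```python
import gc
import sqlite3
from array import array
from itertools import islice

def batched(items, size):
    """Yield lists of at most `size` items without materializing the input."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch

def fake_embed(text, dim=8):
    """Placeholder for a real embedding model; returns a float32 vector."""
    return array("f", [float(len(text) + i) for i in range(dim)])

def stream_embeddings(conn, texts, batch_size=2):
    """Embed in batches, write each batch to SQLite, then free it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings (id INTEGER PRIMARY KEY, vec BLOB)"
    )
    written = 0
    for batch in batched(texts, batch_size):
        rows = [(fake_embed(t).tobytes(),) for t in batch]
        with conn:  # commit once per batch
            conn.executemany("INSERT INTO embeddings (vec) VALUES (?)", rows)
        written += len(rows)
        del rows      # drop the only reference to this batch's vectors
        gc.collect()  # eager collection; a real pipeline would gate this on usage
    return written
```

Because only one batch is alive at a time, peak memory depends on batch_size rather than on the total number of nodes.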

100% Local

No cloud services, no API keys, no telemetry. SQLite + FAISS for storage, sentence-transformers for embeddings. All data stays on your machine.


5-Signal Hybrid Search

Every query runs through five signals fused via Reciprocal Rank Fusion (RRF):

| Signal | What it finds |
| --- | --- |
| Code Vector | Semantically similar code |
| Text Vector | Docs and comments in 100+ languages |
| Graph Expansion | Structurally connected nodes (callers, imports) |
| Full-Text Search | Exact keyword matches (BM25) |
| Community Context | Related nodes from the same cluster |
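
Reciprocal Rank Fusion scores each item by summing 1 / (k + rank) over every ranked list it appears in. A minimal sketch, assuming k = 60 (a common default, not confirmed for hedwig-cg) and hypothetical node ids and signal names:

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of node ids with Reciprocal Rank Fusion.

    rankings: dict mapping signal name -> list of ids, best first.
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, node_id in enumerate(ranked_ids, start=1):
            scores[node_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical per-signal results for one query.
signals = {
    "code_vector": ["auth.login", "auth.verify", "db.connect"],
    "text_vector": ["docs/auth.md", "auth.login"],
    "graph": ["auth.verify", "auth.login"],
    "fts": ["auth.login"],
    "community": ["auth.verify", "auth.session"],
}
fused = rrf_fuse(signals)
```

Because RRF uses only ranks, signals with incomparable raw scores (cosine similarity, BM25, graph distance) can be combined without normalization. Here auth.login ranks first because four of the five signals return it near the top.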

CLI Reference

All commands output compact JSON by default (designed for AI agent consumption).

| Command | Description |
| --- | --- |
| build <dir> | Build code graph (--incremental, --no-embed) |
| search <query> | 5-signal hybrid search (--top-k, --fast, --expand) |
| query | Interactive search REPL |
| communities | List and search communities (--search, --level) |
| stats | Graph statistics |
| node <id> | Node details with fuzzy matching |
| export | Export as JSON, GraphML, or D3.js |
| visualize | Interactive HTML visualization |
| clean | Remove .hedwig-cg/ database |
| doctor | Check installation health |
| mcp | Start MCP server (stdio) |
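
Since the commands emit JSON, a script can call the CLI and parse the result. The sketch below assumes that --top-k takes an integer value and that the JSON arrives on stdout as a single document; the output schema is not described here, so the result is returned unparsed beyond json.loads. The helper names are hypothetical.

```python
import json
import subprocess

def search_command(query: str, top_k: int = 5, fast: bool = False) -> list[str]:
    """Build the argument list for `hedwig-cg search`."""
    cmd = ["hedwig-cg", "search", query, "--top-k", str(top_k)]
    if fast:
        cmd.append("--fast")
    return cmd

def run_search(query: str, **options):
    """Run the search and decode its JSON output (schema not assumed)."""
    proc = subprocess.run(
        search_command(query, **options),
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)

# Building the command has no side effects; run_search needs hedwig-cg installed.
cmd = search_command("where is login handled", top_k=3, fast=True)
```

Passing the query as a single list element avoids shell quoting problems with spaces or special characters.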

Performance

Benchmarks on hedwig-cg's own codebase (~3,500 lines, 90 files, 1,300 nodes):

| Operation | Time |
| --- | --- |
| Full build | ~14s |
| Incremental (changes) | ~4s |
| Incremental (no changes) | ~0.4s |
| Cold search (dual model) | ~2.8s |
| Cold search (--fast) | ~0.2s |
| Warm search | ~0.08s |
| Cached search | <1ms |

  • Embedding models: ~470MB, downloaded once to ~/.hedwig-cg/models/
  • Database: ~2MB (SQLite + FTS5 + FAISS indices)
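  • Incremental builds: SHA-256 hashing; in the table above, ~97% faster than a full build when nothing changed (~0.4s vs ~14s) and ~71% faster with changes (~4s vs ~14s)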

Requirements

  • Python 3.10+
  • ~470MB disk for embedding models (cached on first use)
# Optional: PDF extraction
pip install "hedwig-cg[docs]"

Development

pip install -e ".[dev]"
pytest
ruff check hedwig_cg/

License

MIT License. See LICENSE for details.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.
