Local-first code graph builder with 5-signal hybrid search for AI coding agents

Maintainers

algodesigner

codeloom

"With codeloom, your coding agent knows what to read."

Python 3.10+

Why codeloom?

"raw data from a given number of sources is collected, then compiled by an LLM into a .md wiki, then operated on by various CLIs by the LLM to do Q&A and to incrementally enhance the wiki" - Andrej Karpathy

codeloom builds a queryable code graph and knowledge base from codebases of 10,000+ files plus knowledge documents, powered by lightweight local models. Hybrid search fuses vector and keyword results with Reciprocal Rank Fusion (RRF) and returns an MST-based subgraph connecting the hits, so coding agents can see how the parts of your project relate instead of just matching keywords. Install it, and Claude Code sees the full picture: no extra tokens, no extra commands, and everything runs 100% locally.

Quick Start

```bash
pip install codeloom

cd your-project/
codeloom opencode install    # for OpenCode
# or: codeloom claude install  # for Claude Code
```

Then tell Claude Code or OpenCode:

"Build a code graph for this project"

That's it. Your agent will build the graph, and from then on, consult it before every search. The graph auto-rebuilds when your session ends.

AI Agent Integrations

codeloom integrates with major AI coding agents in one command:

| Agent | Install | What it does |
| --- | --- | --- |
| Claude Code | `codeloom claude install` | Skill + CLAUDE.md + PreToolUse hook |
| OpenCode | `codeloom opencode install` | Skill in `.opencode/skills/` |
| Codex CLI | `codeloom codex install` | AGENTS.md + PreToolUse hook |
| Gemini CLI | `codeloom gemini install` | GEMINI.md + BeforeTool hook |
| Cursor IDE | `codeloom cursor install` | `.cursor/rules/` rule file |
| Windsurf IDE | `codeloom windsurf install` | `.windsurf/rules/` rule file |
| Cline | `codeloom cline install` | `.clinerules` file |
| Aider CLI | `codeloom aider install` | CONVENTIONS.md + `.aider.conf.yml` |
| MCP Server | `claude mcp add codeloom -- codeloom mcp` | 5 tools over Model Context Protocol |

Each install does two things: it writes a context file with rules and, where supported, registers a hook that fires before tool calls. To remove an integration, run `codeloom <platform> uninstall`.

Supported Languages

Structural Extraction (20+ languages)

codeloom extracts functions, classes, methods, calls, imports, and inheritance from source code using tree-sitter and native parsers.


Python	JavaScript	TypeScript	Go
Rust	Java	C	C++
C#	Ruby	Swift	Scala
Lua	PHP	Elixir	Kotlin
Objective-C	Terraform/HCL

Also extracts structure from config and document formats: YAML, JSON, TOML, Markdown, PDF, HTML, CSV, Shell, R, and more.

Multilingual Natural Language

Text nodes (docs, comments, markdown) are embedded with intfloat/multilingual-e5-small supporting 100+ natural languages — Korean, Japanese, Chinese, German, French, and more. Search in your language, find results in any language.

Features

Auto-Rebuild

When integrated with AI coding agents (Claude Code, Codex, etc.), codeloom rebuilds the graph automatically after code changes. When a session stops or ends, the Stop/SessionEnd hook detects modified files via `git diff` and triggers an incremental rebuild in the background, with no manual intervention.

Git-Accelerated Deltas

codeloom uses `git diff` to avoid the heavy I/O of a full filesystem scan. Only changed files are processed, enabling sub-second updates for typical logic changes. Toggle with `--git`.
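As an illustration of the idea, here is a minimal sketch of collecting changed and deleted paths from `git diff --name-status` output. The function names `parse_name_status` and `git_changes` are hypothetical and are not part of codeloom's API; how codeloom actually invokes git is not documented here.

```python
import subprocess
from pathlib import Path


def parse_name_status(output: str) -> tuple[set[str], set[str]]:
    """Split `git diff --name-status` output into (changed, deleted) paths."""
    changed: set[str] = set()
    deleted: set[str] = set()
    for line in output.splitlines():
        if not line.strip():
            continue
        status, *paths = line.split("\t")
        if status.startswith("D"):
            deleted.add(paths[0])
        elif status.startswith("R") and len(paths) == 2:
            # A rename removes the old path and adds the new one.
            deleted.add(paths[0])
            changed.add(paths[1])
        else:
            # Added, modified, copied, type-changed: re-extract the last path.
            changed.add(paths[-1])
    return changed, deleted


def git_changes(repo: Path, base: str = "HEAD") -> tuple[set[str], set[str]]:
    """Ask git which tracked files differ from `base` (needs a git repository)."""
    result = subprocess.run(
        ["git", "-C", str(repo), "diff", "--name-status", base],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_name_status(result.stdout)
```

Treating a rename as "delete old path, add new path" is one simple way to keep a graph free of stale entries; a real implementation might instead move the existing node.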

Smart Ignore & Pruning

codeloom respects ignore patterns from three sources and explicitly prunes deleted files, ensuring no "ghost nodes" remain in your graph after files are removed or renamed.

| Source | Description |
| --- | --- |
| Built-in | `.git`, `node_modules`, `__pycache__`, `dist`, `build`, etc. |
| `.gitignore` | Auto-read from project root, so your existing git ignores just work |
| `.codeloom-ignore` | Project-specific overrides for the code graph |
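The combination of ignore patterns and pruning can be sketched as follows. This is a simplified illustration using `fnmatch`, not full gitignore semantics (no negation or anchoring), and the names `is_ignored` and `prune_ghost_nodes` are hypothetical rather than codeloom functions.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Subset of the built-in patterns listed above.
BUILTIN_IGNORES = [".git", "node_modules", "__pycache__", "dist", "build"]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """True if the whole path or any single path component matches a pattern."""
    parts = PurePosixPath(path).parts
    return any(
        fnmatchcase(path, pat) or any(fnmatchcase(part, pat) for part in parts)
        for pat in patterns
    )


def prune_ghost_nodes(
    indexed: set[str], on_disk: set[str], patterns: list[str]
) -> set[str]:
    """Return indexed files to drop: those deleted from disk or now ignored."""
    return {p for p in indexed if p not in on_disk or is_ignored(p, patterns)}
```

For example, merging the built-in list with patterns read from `.gitignore` and `.codeloom-ignore` into one `patterns` list, then calling `prune_ghost_nodes` after each scan, would remove entries for files that were deleted or renamed.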

Incremental Builds

SHA-256 content hashing per file and hot-start PageRank. Only changed files are re-extracted and re-embedded, while previous importance scores are reused for rapid convergence — typically 95%+ faster than a full rebuild.
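Both ideas in this paragraph can be sketched in a few lines. The code below is an illustration under assumptions, not codeloom's implementation: `file_digest`, `diff_manifest`, and `pagerank` are hypothetical names, the manifest is assumed to be a plain `{path: digest}` dict, and the damping factor 0.85 and tolerance are conventional defaults rather than values from the project.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's content, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def diff_manifest(old: dict[str, str], new: dict[str, str]) -> tuple[set[str], set[str]]:
    """Compare {path: digest} manifests; return (to_reprocess, to_prune)."""
    to_reprocess = {p for p, digest in new.items() if old.get(p) != digest}
    to_prune = set(old) - set(new)
    return to_reprocess, to_prune


def pagerank(out_links, init=None, damping=0.85, tol=1e-8, max_iter=200):
    """Power-iteration PageRank; `init` seeds scores from a previous run.

    `out_links` maps every node to a list of its targets (all targets must
    also be keys). Returns (scores, iterations_used).
    """
    nodes = list(out_links)
    n = len(nodes)
    raw = {v: (init or {}).get(v, 1.0 / n) for v in nodes}
    total = sum(raw.values())
    scores = {v: s / total for v, s in raw.items()}  # normalise to sum 1
    for iteration in range(1, max_iter + 1):
        # Rank held by nodes without out-links is spread evenly.
        dangling = sum(scores[v] for v in nodes if not out_links[v])
        nxt = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                nxt[w] += damping * scores[v] / len(out_links[v])
        delta = sum(abs(nxt[v] - scores[v]) for v in nodes)
        scores = nxt
        if delta < tol:
            return scores, iteration
    return scores, max_iter
```

Passing the previous run's scores as `init` starts the iteration close to the fixed point, which is why a warm (hot) start needs fewer iterations than starting from a uniform distribution when the graph has changed only slightly.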

Memory Management

4GB memory budget with stage-wise release. The pipeline generates → stores → frees at each stage: extraction results are freed after graph build, embeddings are streamed in batches and freed after DB write, and the full graph is released after persistence. GC triggers proactively at 75% threshold.
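The generate, store, free pattern for embeddings might look roughly like the sketch below. It is an assumption-laden illustration: `fake_embed` is a stand-in for a real embedding model, the table layout is invented for the example, and the 75% threshold check is omitted in favour of collecting after every batch.

```python
import gc
import sqlite3
from itertools import islice
from typing import Iterable, Iterator


def batched(items: Iterable, size: int) -> Iterator[list]:
    """Yield lists of at most `size` items without materialising the input."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def fake_embed(text: str) -> bytes:
    """Stand-in for an embedding model: returns a few bytes per text."""
    return len(text).to_bytes(4, "little")


def store_embeddings(
    conn: sqlite3.Connection, texts: Iterable[str], batch_size: int = 256
) -> int:
    """Embed and persist texts batch by batch; returns the number of rows written."""
    conn.execute("CREATE TABLE IF NOT EXISTS embeddings (text TEXT, vec BLOB)")
    written = 0
    for batch in batched(texts, batch_size):
        rows = [(t, fake_embed(t)) for t in batch]                      # generate
        conn.executemany("INSERT INTO embeddings VALUES (?, ?)", rows)  # store
        conn.commit()
        written += len(rows)
        del rows, batch                                                 # free
        gc.collect()
    return written
```

Because `texts` can be a generator, peak memory is bounded by one batch rather than by the whole corpus.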

100% Local

No cloud services, no API keys, no telemetry. SQLite + FAISS for storage, sentence-transformers for embeddings. All data stays on your machine.

Hybrid Search with Subgraph Response

Every query returns seed nodes and a subgraph showing how they connect:

Search Pipeline

| Signal | What it finds |
| --- | --- |
| Vector Search | Semantically similar code and documents (dual-model: code + text) |
| Keyword Search | Exact name matches via FTS5 (BM25) |

Results are fused via Weighted Reciprocal Rank Fusion (RRF), then connected through MST-based shortest paths to reveal how seed nodes relate.
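For readers unfamiliar with the fusion step, weighted RRF scores each result as the sum over signals of `weight / (k + rank)`. The sketch below is a generic version of that formula; the function name `weighted_rrf`, the constant `k = 60` (a common default in the RRF literature), and the equal weights are assumptions, not codeloom's actual settings.

```python
from collections import defaultdict


def weighted_rrf(
    rankings: dict[str, list[str]],
    weights: dict[str, float],
    k: int = 60,
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: score(id) = sum_i weight_i / (k + rank_i(id))."""
    scores: dict[str, float] = defaultdict(float)
    for signal, ranked_ids in rankings.items():
        weight = weights.get(signal, 1.0)
        for rank, node_id in enumerate(ranked_ids, start=1):
            scores[node_id] += weight / (k + rank)
    # Highest score first; ties broken by ID for deterministic output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = weighted_rrf(
    {"vector": ["a", "b", "c"], "keyword": ["b", "d"]},
    weights={"vector": 1.0, "keyword": 1.0},
)
print([node_id for node_id, _ in fused])  # ['b', 'a', 'd', 'c']
```

Node `b` ranks first because it appears in both lists, which is the property that makes rank fusion useful when the raw scores of the two signals are not comparable.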

Smart Test Demotion: By default, test files are penalised in ranking (0.3× score multiplier) so that source-code results surface first. The heuristic detects test files across 8+ language conventions (Python test_*.py, Java *Test.java, JS *.test.ts, Go *_test.go, Rust *_test.rs, C# *Test.cs, Ruby *_spec.rb, and more) plus directory patterns (test/, tests/, spec/, src/test/). When results mix source and test files, a hint reports the split. Disable with --include-tests.
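A heuristic of this kind could be sketched as below. The pattern list copies the conventions named above, but the function names and the exact matching rules are assumptions for illustration; codeloom's real detector covers more cases.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

TEST_FILE_PATTERNS = [
    "test_*.py", "*Test.java", "*.test.ts", "*_test.go",
    "*_test.rs", "*Test.cs", "*_spec.rb",
]
TEST_DIRS = {"test", "tests", "spec"}  # "src/test/" is covered by "test"
TEST_MULTIPLIER = 0.3


def is_test_file(path: str) -> bool:
    """Match the file name against test patterns, then check parent directories."""
    p = PurePosixPath(path)
    if any(fnmatchcase(p.name, pat) for pat in TEST_FILE_PATTERNS):
        return True
    return any(part in TEST_DIRS for part in p.parts[:-1])


def demote_tests(
    results: list[tuple[str, float]], include_tests: bool = False
) -> list[tuple[str, float]]:
    """Scale test-file scores by 0.3 unless include_tests is set, then re-sort."""
    if not include_tests:
        results = [
            (path, score * TEST_MULTIPLIER if is_test_file(path) else score)
            for path, score in results
        ]
    return sorted(results, key=lambda r: -r[1])
```

With these rules, a test file scoring 0.9 drops to 0.27 and falls below a source file scoring 0.5, while `include_tests=True` leaves the original order intact.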

Context Snippets: The top 3 seed results include an inline snippet of source code (up to 5 lines) to help you immediately decide whether a result is relevant — no separate Read call needed. Configure with --snippets N (default 3, 0 to disable).
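Snippet attachment is simple to picture in code. The following sketch assumes seeds are `(file, line)` pairs with 1-based line numbers and that file contents are available in a dict; `attach_snippets` is a hypothetical helper, not a codeloom function.

```python
def attach_snippets(
    seeds: list[tuple[str, int]],
    sources: dict[str, str],
    top_n: int = 3,
    max_lines: int = 5,
) -> list[tuple[str, list[str]]]:
    """Pair each seed ID with up to `max_lines` source lines for the top `top_n` seeds."""
    results = []
    for index, (file, line) in enumerate(seeds):
        snippet: list[str] = []
        if index < top_n and file in sources:
            lines = sources[file].splitlines()
            start = max(line - 1, 0)  # line 0 (a file-level node) starts at the top
            snippet = lines[start:start + max_lines]
        results.append((f"{file}:{line}", snippet))
    return results
```

Setting `top_n=0` mirrors `--snippets 0`: every seed is returned with an empty snippet.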

Response Format

```
seeds:
codeloom/core/pipeline.py:71
  │ def run_pipeline(source_dir: Path, ...) -> PipelineResult:
  │     """Run the full code graph build pipeline."""
  │     source_dir = Path(source_dir).resolve()
storage/store.py:20
  │ class KnowledgeStore:

edges:
codeloom/core/pipeline.py:71 -calls-> storage/store.py:20
codeloom/core/pipeline.py:0 -co_change-> storage/store.py:0
codeloom/core/pipeline.py:0 -defines-> codeloom/core/pipeline.py:71
```

- `seeds`: node IDs (`file:line`) found by search, with optional source snippets
- `edges`: subgraph connecting the seeds through shortest paths (intermediate nodes appear in the edges)

CLI Reference

All commands output compact text by default (designed for AI agent consumption).

| Command | Description |
| --- | --- |
| `build <dir>` | Build code graph (`--incremental`, `--git`) |
| `watch <dir>` | Real-time file system monitor for instant graph sync |
| `impact <id>` | Analyze "blast radius": find all downstream dependents |
| `dependencies <id>` | Analyze upstream dependencies for a given symbol |
| `setup` | One-step automated setup for all detected agents |
| `search <query>` | Hybrid vector + keyword search with subgraph and snippets (`--top-k`, `--fast`, `--kind`, `--file`, `--include-tests`, `--snippets`) |
| `search-vector <query>` | Vector similarity only (code + text dual model) |
| `search-keyword <query>` | FTS5 keyword matching only (BM25 ranking) |
| `query` | Interactive search REPL |
| `communities` | List and search communities (`--search`, `--level`) |
| `stats` | Graph statistics |
| `node <id>` | Node details with fuzzy matching |
| `export` | Export as JSON, GraphML, or D3.js |
| `visualize` | Interactive HTML visualization |
| `clean` | Remove `.codeloom/` database |
| `doctor` | Check installation health |
| `mcp` | Start MCP server (stdio) |
| `claude install\|uninstall` | Manage Claude Code integration |
| `codex install\|uninstall` | Manage Codex CLI integration |
| `gemini install\|uninstall` | Manage Gemini CLI integration |
| `cursor install\|uninstall` | Manage Cursor IDE integration |
| `windsurf install\|uninstall` | Manage Windsurf IDE integration |
| `cline install\|uninstall` | Manage Cline integration |
| `aider install\|uninstall` | Manage Aider CLI integration |
| `opencode install\|uninstall` | Manage OpenCode integration |
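To make the `impact` idea concrete, a "blast radius" query can be viewed as a breadth-first walk over reversed dependency edges. The sketch below assumes edges are `(source, relation, destination)` triples in which the source depends on the destination, as in the `-calls->` lines of the response format; `downstream_dependents` is a hypothetical helper and not the command's implementation.

```python
from collections import defaultdict, deque


def downstream_dependents(
    edges: list[tuple[str, str, str]], target: str
) -> set[str]:
    """All nodes that depend on `target`, directly or transitively."""
    reverse: dict[str, set[str]] = defaultdict(set)
    for source, _relation, destination in edges:
        reverse[destination].add(source)

    seen: set[str] = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for dependent in reverse.get(node, ()):
            if dependent not in seen and dependent != target:
                seen.add(dependent)
                queue.append(dependent)
    return seen
```

Walking the same edges in the forward direction would give the upstream view that `dependencies <id>` describes.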

Performance

Benchmarks on codeloom's own codebase (~3,500 lines, 90 files, 1,300 nodes):

| Operation | Time |
| --- | --- |
| Full build | ~14s |
| Incremental (changes) | ~4s |
| Incremental (no changes) | ~0.4s |
| Cold search (dual model) | ~2.8s |
| Cold search (`--fast`) | ~0.2s |
| Warm search | ~0.08s |
| Cached search | <1ms |

Embedding models: ~180MB, downloaded once to ~/.codeloom/models/
Database: ~2MB (SQLite + FTS5 + FAISS indices)
Incremental builds: SHA-256 hashing, 95%+ faster than full rebuild

Requirements

Python 3.10+
~180MB disk for embedding models (cached on first use)

```bash
# Optional: PDF extraction (quoted so shells such as zsh do not expand the brackets)
pip install "codeloom[docs]"
```

Development

```bash
pip install -e ".[dev]"
pytest
ruff check codeloom/
```

License

MIT License. See LICENSE for details.

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

Release history

- 0.1.10 (Jun 10, 2026)
- 0.1.8 (Jun 8, 2026)
- 0.1.7 (Jun 7, 2026)
- 0.1.6 (Jun 6, 2026)
- 0.1.5 (Jun 6, 2026)
- 0.1.4 (May 28, 2026)
- 0.1.3 (May 28, 2026), this version
- 0.1.2 (May 27, 2026)
- 0.1.1 (May 19, 2026)
- 0.1.0 (May 16, 2026)

Download files

Source Distribution

codeloom-0.1.3.tar.gz (555.4 kB), uploaded May 28, 2026 (Source)

Built Distribution

codeloom-0.1.3-py3-none-any.whl (204.4 kB), uploaded May 28, 2026 (Python 3)

File details

Details for the file codeloom-0.1.3.tar.gz.

File metadata

Download URL: codeloom-0.1.3.tar.gz
Upload date: May 28, 2026
Size: 555.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for codeloom-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`a3f2a07f1f235716a2d2d12869b69cddad23625111e5e68ac8ae38df9d1f897a`
MD5	`03a793591d056b773e418a94f377c59e`
BLAKE2b-256	`0f8df034301b70cfb0cfc603a658f9a3307e574539ae8296aebed866bb043385`

Provenance

The following attestation bundles were made for codeloom-0.1.3.tar.gz:

Publisher: release.yml on algodesigner/codeloom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: codeloom-0.1.3.tar.gz
- Subject digest: a3f2a07f1f235716a2d2d12869b69cddad23625111e5e68ac8ae38df9d1f897a
- Sigstore transparency entry: 1650116798
- Sigstore integration time: May 28, 2026
Source repository:
- Permalink: algodesigner/codeloom@a0e9a90c8845c6ba63fbaa58a6bb1fe06a645014
- Branch / Tag: refs/heads/main
- Owner: https://github.com/algodesigner
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a0e9a90c8845c6ba63fbaa58a6bb1fe06a645014
- Trigger Event: workflow_dispatch

File details

Details for the file codeloom-0.1.3-py3-none-any.whl.

File metadata

Download URL: codeloom-0.1.3-py3-none-any.whl
Upload date: May 28, 2026
Size: 204.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for codeloom-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7a49cda4375d1f83940ae4457b658fe3b8702085f27de1eb837a021f2683db15`
MD5	`a86e2ae2c260785d0b658e9469b70dcc`
BLAKE2b-256	`ac634bf523ce1072ec192df506bd8b8bb8e697fc84cc95e551bdf229afe58b34`

Provenance

The following attestation bundles were made for codeloom-0.1.3-py3-none-any.whl:

Publisher: release.yml on algodesigner/codeloom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: codeloom-0.1.3-py3-none-any.whl
- Subject digest: 7a49cda4375d1f83940ae4457b658fe3b8702085f27de1eb837a021f2683db15
- Sigstore transparency entry: 1650117004
- Sigstore integration time: May 28, 2026
Source repository:
- Permalink: algodesigner/codeloom@a0e9a90c8845c6ba63fbaa58a6bb1fe06a645014
- Branch / Tag: refs/heads/main
- Owner: https://github.com/algodesigner
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@a0e9a90c8845c6ba63fbaa58a6bb1fe06a645014
- Trigger Event: workflow_dispatch
