megabrain

Local code-intelligence engine: one call returns all the code related to a question, explained with the real code spliced in.

One call returns all the code related to a question, explained like a senior engineer, with the real code spliced in.

Python 3.10+ · No LLM in the retrieval path · Zero code hallucination · MCP ready

megabrain is a local code-intelligence engine. It replaces minutes of file-by-file crawling (grep, read, explore-agent chains) with a single grounded answer. Index a repo once; every later question retrieves all the related code and stitches it into a walkthrough. The walkthrough is narrated by an LLM that can only point at code, never rewrite it, so the code shown is never hallucinated.

Install

No packaging step — runs straight from a clone:

git clone https://github.com/pinecall/megabrain.git
cd megabrain
pip install numpy                                # core (Python indexing)
pip install tree_sitter tree_sitter_typescript  # TS/JS (+ tree_sitter_ruby tree_sitter_go for Ruby/Go)
alias megabrain='python3 -m megabrain.cli'       # optional: clean invocation

Keys are read from the environment (with a ~/.zshrc fallback):

export PERPLEXITY_API_KEY=...   # required — embeddings
export ANTHROPIC_API_KEY=...    # only for `ask` and `--best`
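The lookup order described above (environment first, then `~/.zshrc`) could be sketched as follows. This is an illustration only, not megabrain's actual code: the function name `read_key` and its parameters are hypothetical, and the parsing only covers simple `export NAME=value` lines.

```python
import os
import re
import shlex
from pathlib import Path

# Matches simple lines such as: export NAME=value  or  export NAME="value"
_EXPORT_RE = re.compile(r"^\s*export\s+([A-Za-z_][A-Za-z0-9_]*)=(.*)$")


def read_key(name, rc_path=None, environ=None):
    """Return `name` from the environment, else from an export line in rc_path.

    Hypothetical helper: returns None when the key is found in neither place.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value:
        return value  # the environment always wins

    rc_path = Path.home() / ".zshrc" if rc_path is None else Path(rc_path)
    try:
        lines = rc_path.read_text(encoding="utf-8", errors="replace").splitlines()
    except OSError:
        return None  # no rc file, or it is unreadable

    found = None
    for line in lines:
        match = _EXPORT_RE.match(line)
        if not match or match.group(1) != name:
            continue
        try:
            # shlex strips quotes and trailing "# comments" like a shell would
            tokens = shlex.split(match.group(2), comments=True)
        except ValueError:
            continue  # unbalanced quotes: skip the line
        if tokens:
            found = tokens[0]  # a later export overrides an earlier one
    return found
```

A caller would use it as `read_key("PERPLEXITY_API_KEY")` and raise a clear error when the result is `None`.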

Usage

megabrain index  ~/repo                                      # incremental (sha256), no daemon
megabrain ask    ~/repo "how does auth work end to end"      # walkthrough + real code (~6–20s)
megabrain ask    ~/repo "how do I configure X" --docs        # explain the docs instead of code
megabrain query  ~/repo "request retry logic"                # raw code map, no LLM (~200ms)
megabrain get    ~/repo src/x.py --symbol Class.method       # one file or symbol
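The `index` command is described as incremental by sha256 with no daemon. One way such a check can work is to hash every file on each run and compare against the digests stored from the previous run. The sketch below assumes that design; `plan_reindex` and its arguments are hypothetical names, and the stored digests are modelled as a plain dict rather than megabrain's SQLite store.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Hash a file in 64 KiB blocks so large files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1 << 16), b""):
            digest.update(block)
    return digest.hexdigest()


def plan_reindex(root, known, suffixes=(".py", ".md")):
    """Compare files under `root` with `known` ({relative_path: sha256}).

    Returns (changed, removed, current):
      changed - new or modified files that need re-chunking and re-embedding
      removed - paths in `known` that no longer exist on disk
      current - the fresh {relative_path: sha256} map to store for next time
    """
    root = Path(root)
    current = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            current[path.relative_to(root).as_posix()] = file_sha256(path)
    changed = [p for p, digest in current.items() if known.get(p) != digest]
    removed = sorted(set(known) - set(current))
    return changed, removed, current
```

Because the comparison happens at invocation time, no file watcher is needed; unchanged files cost one hash and no embedding call.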

Indexes code (.py · .ts · .tsx · .js · .jsx · .mjs · .cjs · Ruby · Go) and markdown (.md · .markdown · .mdx) through a strategy registry — adding a language or content type is a config entry, not a branch in the indexer.
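A strategy registry of this kind can be pictured as a table from file extension to a chunking strategy. The sketch below is a hedged illustration, not megabrain's registry: `Strategy`, `REGISTRY`, `strategy_for`, and both chunkers are hypothetical stand-ins (the real engine uses cAST chunking, which is not reproduced here).

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Strategy:
    kind: str                            # "code" or "docs"
    chunk: Callable[[str], List[str]]    # source text -> list of chunks


def chunk_by_blank_lines(text):
    """Placeholder code chunker: split on blank lines."""
    return [block for block in text.split("\n\n") if block.strip()]


def chunk_by_headings(text):
    """Placeholder markdown chunker: start a new chunk at each heading line."""
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


# Supporting a new extension means adding one entry here.
REGISTRY = {
    ".py": Strategy("code", chunk_by_blank_lines),
    ".ts": Strategy("code", chunk_by_blank_lines),
    ".md": Strategy("docs", chunk_by_headings),
    ".mdx": Strategy("docs", chunk_by_headings),
}


def strategy_for(path) -> Optional[Strategy]:
    """Look up the strategy by extension; None means the file is skipped."""
    return REGISTRY.get(Path(path).suffix.lower())
```

The indexer then only calls `strategy_for(path)` and never branches on language itself.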

How it works

A three-stage pipeline. Only ask calls an LLM — and only to narrate.

| stage | what it does |
| --- | --- |
| index | cAST chunk → Perplexity embed (int8, L2-normalized) → SQLite. Incremental by `sha256`, no watcher. |
| query | No-LLM retrieval (~200ms): dense-chunk + file-skeleton fusion, with import/call-graph candidates. Returns a map: CORE (full code of the top files) + RELATED (every connected file with its best chunk). |
| ask | One streamed Haiku call writes the walkthrough and cites code as `[[k]]`; the engine replaces each citation with the verbatim block (real file, real line numbers). Non-cited related files are listed at the end. Fail-open: any API error falls back to the full `query` bundle. |

Because the model only emits citations and the engine splices code from disk, code cannot be hallucinated or rewritten.
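The citation-splicing step can be illustrated with a small sketch. It assumes the engine holds a map from citation number to a block already read from disk; the names `Block`, `render`, and `splice`, and the output layout, are hypothetical and not megabrain's actual format.

```python
import re
from dataclasses import dataclass

CITATION_RE = re.compile(r"\[\[(\d+)\]\]")


@dataclass(frozen=True)
class Block:
    path: str    # file the code came from
    start: int   # 1-based line number of the first line
    text: str    # verbatim source, as read from disk


def render(block):
    """Format a block with its path and real line numbers."""
    lines = block.text.splitlines()
    end = block.start + len(lines) - 1
    numbered = "\n".join(
        f"{block.start + i:>5}  {line}" for i, line in enumerate(lines)
    )
    return f"--- {block.path}:{block.start}-{end}\n{numbered}\n---"


def splice(narration, blocks):
    """Replace each [[k]] in the narration with the verbatim block k.

    Returns (spliced_text, uncited), where `uncited` lists the block
    numbers the model never referenced, so they can be appended later.
    """
    cited = set()

    def replace(match):
        k = int(match.group(1))
        block = blocks.get(k)
        if block is None:
            return match.group(0)  # unknown citation: leave the marker as-is
        cited.add(k)
        return "\n" + render(block) + "\n"

    spliced = CITATION_RE.sub(replace, narration)
    return spliced, sorted(set(blocks) - cited)
```

The model's output never contains code in this scheme, only markers, so every code line in the final answer comes from `Block.text`.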

MCP

Use it from Claude Code or any MCP client:

claude mcp add megabrain -- python3 -m megabrain.mcp_server

Tools: megabrain_ask (primary), megabrain_query, megabrain_get, megabrain_index. The server auto-refreshes a stale index before answering, so results always match disk.

Design

Every choice below is backed by an internal golden set (30 verified queries):

| decision | evidence |
| --- | --- |
| cAST chunking (4K nws chars, breadcrumbs, partition-guaranteed) | unit-tested; every line lands in exactly one chunk (no gaps, no overlaps) |
| `pplx-embed-v1` (1024-d, int8 wire, L2-normalized) | beats `openai-3-large` on code; ~$0.0016/repo |
| dense chunk + 0.5 × file-skeleton score | dual-granularity; precision up, no downside |
| graph (import + call edges) for candidates only | PageRank-as-ranking rejected by data (Acc@1 0.91 → 0.73) |
| no LLM in the retrieval path | every LLM prune variant cost completeness; `ask` explains, it never prunes |
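The "dense chunk + 0.5 × file-skeleton score" row can be read as a per-file fusion of two similarity scores. The sketch below is one plausible interpretation under stated assumptions: similarity is cosine (a dot product of L2-normalized vectors), and a file's chunk score is the maximum over its chunks. The source does not confirm either detail, and `fuse` is a hypothetical name.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fuse(query, chunks, skeletons, weight=0.5):
    """Rank files by best-chunk score + weight * file-skeleton score.

    chunks    - list of (file, chunk_vector)
    skeletons - {file: skeleton_vector}
    Returns [(file, fused_score), ...], highest score first.
    """
    q = l2_normalize(query)

    # Assumption: a file's dense score is that of its best-matching chunk.
    best = {}
    for file, vector in chunks:
        score = dot(q, l2_normalize(vector))
        best[file] = max(score, best.get(file, -1.0))

    fused = {}
    for file, score in best.items():
        skeleton = skeletons.get(file)
        bonus = dot(q, l2_normalize(skeleton)) if skeleton is not None else 0.0
        fused[file] = score + weight * bonus
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

With two files whose best chunks match the query equally well, the one whose skeleton also matches is ranked first, which is the dual-granularity effect the table describes.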

Engine retrieval (internal golden set): R@1 0.86 · bundle_full 1.00 · p50 8 ms warm. SWE-bench Lite localization (no training): retrieval Acc@1 ≈ 0.52 / @5 ≈ 0.83 — on par with the trained CodeRankEmbed retriever.
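For readers unfamiliar with the metrics, the sketch below shows one common definition of Acc@k: the fraction of queries for which at least one correct file appears in the top k results. The source does not define its metrics, so this definition is an assumption (some evaluations instead require all correct files in the top k), and `acc_at_k` is a hypothetical helper rather than part of the `evals/` harness.

```python
def acc_at_k(results, gold, k):
    """Fraction of queries with at least one gold file in the top k.

    results - {query_id: ranked list of file paths}
    gold    - {query_id: set of correct file paths}
    """
    if not gold:
        return 0.0
    hits = 0
    for query_id, correct in gold.items():
        top_k = results.get(query_id, [])[:k]
        if set(top_k) & set(correct):
            hits += 1
    return hits / len(gold)
```

By this definition Acc@5 can never be lower than Acc@1 for the same ranking, which matches the direction of the figures quoted above.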

Project layout

megabrain/   engine — chunkers, embeddings, SQLite store, graph, indexer, query, ask, cli, mcp_server
evals/       golden.json (30 verified queries) + swebench harness
tests/       engine + chunker gates

github.com/pinecall/megabrain

Release history

0.1.2

Jun 14, 2026

This version

0.1.1

Jun 14, 2026

0.1.0

Jun 14, 2026

File details

Details for the file megabrain-0.1.1.tar.gz.

File metadata

Download URL: megabrain-0.1.1.tar.gz
Upload date: Jun 14, 2026
Size: 53.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for megabrain-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`2b4d86b5340ca23e5eaa9bd54a011727a77f70cdba0c4fc483f7229776a733a3`
MD5	`c1833bf215803c98823b2583b5acab73`
BLAKE2b-256	`c171f348607ecf4b49e75cb2cf64ea97ec657556276d9e7ae4eaa39dbee5327c`

File details

Details for the file megabrain-0.1.1-py3-none-any.whl.

File metadata

Download URL: megabrain-0.1.1-py3-none-any.whl
Upload date: Jun 14, 2026
Size: 52.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for megabrain-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4c073d5698292c2eda69e71440ba451f565f34bfa9169853e3e915b0f1398075`
MD5	`499a41a600b1c18ec74c07fa965af587`
BLAKE2b-256	`186c6c28db29c6c9bcb0820c763c5acd8f826dc205253ae1d37f424f3e82d178`
