Self-hosted codebase intelligence platform — graph + vector indexing with MCP tools for IDE-embedded LLMs
Ripple
Code intelligence from git history — not static analysis.
Works inside Claude Code · Cursor · GitHub Copilot Agent · OpenHands · Windsurf
Static analysis tells you what could break. Git history tells you what actually breaks together. The gap between those two is where production incidents live — and where Ripple operates.
Try it in 60 seconds — no GPU, no config
Analyze any public repo's blast radius from its commit history:
git clone https://github.com/Amitshukla2308/Index-the-code
cd Index-the-code
pip install -e .
python3 apps/cli/demo.py https://github.com/your-org/your-repo
open hr-demo-report.html
You get an HTML report of the highest-risk files ranked by co-change history. On Flask: src/flask/app.py scores 1120 — the single most-coupled file across 2000 commits. No surprise. Immediately useful.
What it does
Blast Radius — +322% recall over static import graph
Every other tool counts import edges. Ripple counts how often files actually changed together across your git history. The difference matters: only 14.9% of import neighbors ever co-change, so the static graph flags risk for files that don't need review while missing real coupling. Temporal signals catch the roughly 85% the graph misses.
| Metric | Static (v1) | Temporal (v2) | Delta |
|---|---|---|---|
| recall@10 | 0.11 | 0.47 | +322% |
| MRR | 0.08 | 0.36 | +359% |
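As a rough illustration of the temporal signal (not Ripple's actual implementation), co-change counts can be mined from commit file lists in a few lines of Python; the history below is synthetic:

```python
from collections import Counter
from itertools import combinations

def cochange_counts(commits):
    """Count how often each pair of files appears in the same commit.

    `commits` is an iterable of file-path lists, one list per commit.
    In a real repo you'd derive these from `git log --name-only`.
    """
    pairs = Counter()
    for files in commits:
        # sorted() gives a canonical (a, b) ordering for each pair
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs

# Synthetic history: app.py and routes.py ship together often.
history = [
    ["app.py", "routes.py"],
    ["app.py", "routes.py", "models.py"],
    ["models.py"],
    ["app.py", "routes.py"],
]
counts = cochange_counts(history)
print(counts[("app.py", "routes.py")])  # → 3
```

Ranking files by the sum of their pair counts gives a crude version of the blast-radius score the demo report visualizes.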
Guard — static semantic checks at 2.4ms/file
AI-generated code passes every review gate because it looks correct. Guard verifies that the code does what it claims: that comments match the code that follows, that locks aren't released before promised mutations complete, that auth happens before action. It catches the class of bugs where the AI wrote a plausible lie.
# Run on any Python/Haskell/Rust/Go/JS codebase
python3 -m ripple.guard path/to/changed_file.py
Patterns: lock scope, premature release, transaction boundaries, auth-before-action, error swallowing.
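The error-swallowing pattern, for example, can be approximated with a small `ast` pass — a hedged sketch of the idea, not Guard's actual checker:

```python
import ast

def find_swallowed_errors(source: str):
    """Flag `except` handlers whose body is a lone `pass`, i.e. errors
    silently discarded. A rough stand-in for one Guard pattern."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            if len(node.body) == 1 and isinstance(node.body[0], ast.Pass):
                findings.append(node.lineno)
    return findings

code = """
try:
    commit()
except Exception:
    pass
"""
print(find_swallowed_errors(code))  # → [4]
```

The real checks are semantic (comment/code agreement, lock scope), so they need more than a single-node match, but the AST-walk skeleton is the same.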
15 MCP Tools — plug into any AI coding assistant
One config block gives your entire team's AI assistants access to your codebase's history:
{
"mcpServers": {
"ripple": {
"type": "sse",
"url": "http://127.0.0.1:8002/sse"
}
}
}
Setup guides: Cursor · GitHub Copilot · OpenHands
| Tool | What it answers |
|---|---|
| check_my_changes | Full PR verdict: blast radius + Guard + risk score + reviewers |
| get_blast_radius | Which files co-change with these? Tiered by confidence. |
| get_why_context | WHY is this code the way it is? Ownership, activity trend, Granger causal direction, anti-patterns. |
| predict_missing_changes | What files are likely missing from this PR? |
| score_change_risk | Composite 0-100 risk score for a changeset |
| suggest_reviewers | Who owns these modules from git history? |
| check_criticality | How critical is this module? (blast + coupling + recency) |
| get_guardrails | What must stay true when touching this module? |
| list_critical_modules | Top-N highest-risk modules in the codebase |
| fast_search | Zero-GPU BM25 keyword search, ~40ms p50. No embed server needed. |
| search_symbols | Semantic + keyword + co-change fusion search |
| search_modules | Find which namespace contains relevant code |
| get_module | All symbols in a module |
| get_function_body | Source code of a function by ID |
| trace_callers | Who calls this? (upstream impact) |
| trace_callees | What does this call? (downstream deps) |
| get_context | Large context block — last resort |
Full setup
Prerequisites
python3 --version # 3.11+
pip install chainlit openai lancedb sentence-transformers networkx \
pyarrow leidenalg igraph rank-bm25 mcp ijson pyyaml \
tree-sitter tree-sitter-haskell tree-sitter-rust
Step 1 — Prepare workspace
mkdir -p ~/projects/workspaces/YOUR_ORG/{source,artifacts,output}
cp -r path/to/your/repos ~/projects/workspaces/YOUR_ORG/source/
cp config.example.yaml ~/projects/workspaces/YOUR_ORG/config.yaml
Step 2 — Choose embedding provider
# Local GPU (no API cost)
EMBED_MODEL=/path/to/model python3 serve/embed_server.py
# Cloud (any, no GPU needed)
EMBED_PROVIDER=openai OPENAI_API_KEY=sk-... python3 serve/embed_server.py
EMBED_PROVIDER=voyage VOYAGE_API_KEY=... python3 serve/embed_server.py
EMBED_PROVIDER=cohere COHERE_API_KEY=... python3 serve/embed_server.py
EMBED_PROVIDER=jina JINA_API_KEY=... python3 serve/embed_server.py
EMBED_PROVIDER=ollama EMBED_PROVIDER_MODEL=nomic-embed-text python3 serve/embed_server.py
Step 3 — Build the index
export REPO_ROOT=~/projects/workspaces/YOUR_ORG/source
export OUTPUT_DIR=~/projects/workspaces/YOUR_ORG/output
export ARTIFACT_DIR=~/projects/workspaces/YOUR_ORG/artifacts
bash build/run_pipeline.sh # 30 min – 2 h depending on codebase size
Step 4 — Start the servers
python3 serve/embed_server.py # start first — other servers share it
ARTIFACT_DIR=~/projects/workspaces/YOUR_ORG/artifacts python3 serve/mcp_server.py
Add .mcp.json to your project and your AI assistant has all 15 tools.
Language support
| Language | Symbols | Call graph | Guard | Co-change |
|---|---|---|---|---|
| Python | ✓ | ✓ | ✓ | ✓ |
| Haskell | ✓ | ✓ (approx) | ✓ | ✓ |
| Rust | ✓ | ✓ | ✓ | ✓ |
| JavaScript/TypeScript | ✓ | ✓ | ✓ | ✓ |
| Go | ✓ | ✓ | ✓ | ✓ |
| Groovy | ✓ | — | ✓ | ✓ |
| Java | ✓ | ✓ | — | ✓ |
Guardian Mode — CI/CD
# PR completeness analysis from the command line
git diff main...HEAD --name-only | python3 serve/pr_analyzer.py
# Zero-config guardian on any repo (no GPU, no AST)
python3 apps/cli/guardian_init.py --repo /path/to/repo
Copy .github/workflows/guardian-lite.yml to any repo for automatic PR risk scoring.
Architecture
embed_server.py (:8001) — loads embedding model once; all servers connect to it
mcp_server.py (:8002) — MCP tool server over SSE
demo_server_v6.py (:8000) — Chainlit chat UI (optional)
retrieval_engine.py — core: all indexes, all retrieval logic, imported by everything
Indexes built once, loaded at startup: symbol graph, vector store (LanceDB), co-change, cross-repo co-change, Granger causality, activity metrics, ownership, guardrail docs.
TurboQuant: optional 7.7x vector compression at 3-bit (312MB vs 1.5GB), recall@10 preserved at 0.91. Set QUANT_BITS=3 at build time to deploy on laptops.
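The compression ratio follows from first principles: 32-bit floats reduced to 3-bit codes is a ~10.7× raw reduction, which lands near the reported 7.7× once per-vector metadata is stored. A toy sketch of scalar quantization (TurboQuant's actual scheme may differ):

```python
def quantize_3bit(vec):
    """Quantize a float vector to 3-bit codes (8 levels), storing min and
    scale once per vector. A toy sketch, not TurboQuant itself."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 7 or 1.0      # 8 levels -> codes 0..7
    codes = [round((x - lo) / scale) for x in vec]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    return [c * scale + lo for c in codes]

vec = [0.12, -1.5, 0.9, 2.3, -0.7, 0.0]
codes, lo, scale = quantize_3bit(vec)
approx = dequantize(codes, lo, scale)
max_err = max(abs(a - b) for a, b in zip(vec, approx))
# Rounding bounds reconstruction error by half a quantization step.
print(max_err <= scale / 2 + 1e-9)  # → True
```

Recall survives because nearest-neighbor ranking only needs relative distances, which coarse per-dimension levels preserve well for high-dimensional embeddings.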
Troubleshooting
| Symptom | Fix |
|---|---|
| Embed server shows device=cpu | Check nvidia-smi and CUDA |
| Semantic search misses domain terms | Add short acronyms to kw_allowlist in config |
| MCP tools missing in IDE | Verify type: sse in .mcp.json, check port 8002 |
| LanceDB write fails on WSL2 | Write to /home/, not /mnt/d/ (ext4 only) |
| Co-change builder OOM | Use 06_build_cochange.py — streams at O(1) memory |
Research
Ripple's temporal signals thesis was validated on a 94K-symbol, 12-service production codebase (113,916 commits):
- Cross-repo co-change: signal is real (p < 10⁻¹³), orthogonal to import graph (0.54% overlap), 1.91× weight when import edge present
- Change prediction model: activity features dominate (79–84% importance); structural features add near-zero at short horizons (K=3)
- Cross-ecosystem: activity dominance holds on Flask (Python) as on Haskell — 84% activity importance
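As a toy illustration of the lagged-activity idea behind the Granger test (not the experiment's actual code), ask: does file A's activity at week t help predict file B's at week t+1? The data below is synthetic:

```python
def lagged_correlation(a, b, lag=1):
    """Pearson correlation between a[t] and b[t+lag]: a crude proxy for the
    'past activity of A predicts activity of B' direction that a proper
    Granger test formalizes with lagged regressions and an F-test."""
    x, y = a[:-lag], b[lag:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x) ** 0.5
    vy = sum((yi - my) ** 2 for yi in y) ** 0.5
    return cov / (vx * vy)

# Synthetic weekly commit counts: b echoes a one week later,
# so the lag-1 correlation is perfect in the a -> b direction.
a = [5, 1, 4, 0, 6, 2, 7, 1]
b = [0] + a[:-1]
print(round(lagged_correlation(a, b, lag=1), 3))  # → 1.0
```

An asymmetry between the two directions (a→b strong, b→a weak) is what a causal-direction signal looks like at this toy scale.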
Full artifacts: ~/lab/experiments/ · Active threads: ~/lab/OPEN_QUESTIONS.md
Self-hosted. Your code never leaves your machines.
Built by Amit Shukla · Research by Carlsbert — an autonomous Claude agent