A local context layer for AI tools: mirror your repositories, index them into a knowledge graph, and serve it over MCP.

contextlake, all your real context in one local lake. Pebble the otter surfacing from a misty lake cradling a glowing pebble of context.

contextlake

All your real context, in one local lake.

A local context layer for your AI tools: mirror your repositories, index them
into a knowledge graph, and serve it over MCP, so agents answer from real source instead of guessing.

Python 3.9+ · Offline-first · License: MIT

Why contextlake

Your AI assistant is only as good as what it can actually see. Point it at one file and it's sharp; ask it about the system (which service calls this API, who depends on that package, where a symbol is really defined across dozens of repos) and it starts guessing.

contextlake gives your tools the real source to read. It mirrors your repositories to your machine, indexes them into a queryable knowledge graph, and serves that graph to your editor over MCP. Everything runs locally and offline, no code leaves your machine, and it carries no credentials of its own.

How it works

contextlake is three layers you adopt one at a time. The mirror is useful on its own, and each layer above it is optional.

contextlake architecture. On the left, your repos: a GitLab group, plus optional Figma, Jira, and other MCP connectors. In the centre, contextlake indexes and mirrors them into a graph and embeddings, a wiki, and connectors. On the right, it serves the result over MCP to your AI tools: Claude Code, Windsurf, Kiro, Cursor, and Postman.

Mirror: clone every repo you can reach in a GitLab group into a faithful copy of its namespace tree, each on its most active branch, kept fresh with one command. (The source is GitLab today; the design is source-agnostic.)
Knowledge layer (optional): parse the mirror into a code + dependency graph, add semantic search, a council-verified wiki, and connectors to Atlassian / Figma / GitLab.
Serve: expose it all over MCP and an offline interactive graph visualizer, so agents can answer "where is X defined?" or "who calls Y?" instead of grepping.
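To make the "where is X defined?" and "who calls Y?" questions concrete, here is a minimal sketch of a definition and call graph built with Python's standard `ast` module. It is an illustration only, not contextlake's indexer: the `SOURCES` files and the `build_graph` helper are hypothetical, and it resolves calls by bare function name, which a real cross-repo index would have to do far more carefully.

```python
import ast
from collections import defaultdict

# Hypothetical two-file "mirror", kept in memory so the sketch is self-contained.
SOURCES = {
    "billing/api.py": "def charge(user):\n    return audit(user)\n\ndef audit(user):\n    return True\n",
    "web/views.py": "from billing.api import charge\n\ndef checkout(user):\n    return charge(user)\n",
}

def build_graph(sources):
    """Return (definitions, callers) built from function defs and direct calls."""
    definitions = {}            # function name -> (path, line number)
    callers = defaultdict(set)  # callee name -> names of functions that call it
    for path, text in sources.items():
        tree = ast.parse(text)
        for func in ast.walk(tree):
            if not isinstance(func, ast.FunctionDef):
                continue
            definitions[func.name] = (path, func.lineno)
            # Record every plain-name call made inside this function body.
            for node in ast.walk(func):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    callers[node.func.id].add(func.name)
    return definitions, callers

definitions, callers = build_graph(SOURCES)
print(definitions["charge"])      # where is charge defined?
print(sorted(callers["charge"]))  # who calls charge?
```

Once the graph exists, both questions become dictionary lookups rather than text searches, which is the point of indexing ahead of time.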

Each layer has its own guide: the mirror in Usage & config, the knowledge layer and serving in Knowledge layer, and the whole flow start to finish in QUICKSTART.

Install

pip install contextlake             # the mirroring CLI
pip install "contextlake[kb]"       # + the knowledge layer (graph, search, wiki, MCP server)

Prefer an isolated, zero-setup install? uv fetches the right Python and an isolated environment for you:

uv tool install "contextlake[kb]"            # install the CLI on your PATH
uvx --from "contextlake[kb]" contextlake --help   # …or run it once, without installing
# pipx install "contextlake[kb]"             # pipx works too

From source (for contributors)

git clone https://github.com/sayak-sarkar/contextlake && cd contextlake
pip install -e ".[kb]"

Prerequisites: git and, for GitLab mirroring only, an authenticated glab (glab auth login). The knowledge layer needs neither. Once installed, contextlake, python -m contextlake, and python3 contextlake.py are equivalent.
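If you want to check the mirroring prerequisites from a script before a first run, a small sketch like the following may help. The `missing_for_mirroring` helper is hypothetical and only checks that the tools are on your PATH; it does not verify that glab is actually authenticated.

```python
import shutil

def missing_for_mirroring(gitlab=True, which=shutil.which):
    """Return the prerequisite tools that are not found on PATH."""
    required = ["git"]
    if gitlab:
        # glab is only needed when mirroring from GitLab.
        required.append("glab")
    return [tool for tool in required if which(tool) is None]

print(missing_for_mirroring())  # an empty list means both tools were found
```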

Quickstart: one repo, no setup

You don't need GitLab or any config to try contextlake on a repo you already have. No install? Run it once with uvx: prefix any command below with uvx --from "contextlake[kb]" (e.g. uvx --from "contextlake[kb]" contextlake index --source .).

contextlake index --source .          # parse this repo into a local knowledge graph
contextlake graph --overview --open   # open the interactive graph in your browser
contextlake serve                     # …or serve it to your AI IDE over MCP

Wire it into your editor in one line, no config file needed (it uses the local ~/.contextlake/kb store you just built):

claude mcp add contextlake -- contextlake serve      # Claude Code
# zero-install variant: claude mcp add contextlake -- uvx --from "contextlake[kb]" contextlake serve
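For editors that read a JSON server list instead of a CLI registration, the same two commands can be expressed as a config entry. The sketch below assumes the widely used `mcpServers` layout (a server name mapped to a `command` and `args`); the `mcp_entry` helper is hypothetical, and the exact file that contextlake steer writes is not shown in this README, so treat the output as an illustration rather than the tool's own format.

```python
import json

def mcp_entry(zero_install=False):
    """Build an MCP server entry that launches `contextlake serve`."""
    if zero_install:
        # Mirrors the uvx variant: run without a prior install.
        command, args = "uvx", ["--from", "contextlake[kb]", "contextlake", "serve"]
    else:
        command, args = "contextlake", ["serve"]
    return {"mcpServers": {"contextlake": {"command": command, "args": args}}}

print(json.dumps(mcp_entry(), indent=2))
```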

The contextlake graph visualizer showing a repository's symbols as a navigable node graph, with a type-glyph legend, search, and a corner minimap

contextlake graph: a whole codebase as one offline, navigable graph.

Everything lands in a local store (~/.contextlake/kb); nothing leaves your machine. Index any path with --source PATH, or every git repo under a directory with --workspace DIR.

Want the full path (mirror a GitLab fleet → graph → wired editor) in a few minutes? QUICKSTART.md walks through the whole flow.

Fleet mode: mirror a GitLab group

Where contextlake goes beyond single-repo tools is mirroring and cross-referencing a whole GitLab fleet. Copy the example config and set your group + workspace:

cp .contextlake.ini.example ~/.contextlake.ini

[contextlake]
work_dir = ~/work
gitlab_group = your-gitlab-group

contextlake status      # see where you stand (read-only)
contextlake sync        # fetch → clone → update → branches → verify → audit
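The [contextlake] section above is plain INI, so other scripts can read the same two settings with Python's standard `configparser`. This is a sketch under that assumption: `load_settings` is a hypothetical helper, it covers only the two keys shown, and it is not how contextlake itself loads its config.

```python
import configparser
from pathlib import Path

# The example section from above, inlined so the sketch runs on its own.
EXAMPLE = """\
[contextlake]
work_dir = ~/work
gitlab_group = your-gitlab-group
"""

def load_settings(text):
    """Parse the [contextlake] section and expand ~ in work_dir."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["contextlake"]
    return {
        "work_dir": Path(section["work_dir"]).expanduser(),
        "gitlab_group": section["gitlab_group"],
    }

settings = load_settings(EXAMPLE)
print(settings["gitlab_group"])
print(settings["work_dir"])
```

To read the real file, pass `Path("~/.contextlake.ini").expanduser().read_text()` instead of `EXAMPLE`.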

contextlake carries no credentials of its own (auth rides on your existing glab login), so .contextlake.ini holds only non-secret settings and is gitignored by default. It runs across hundreds of repos concurrently, with an adaptive worker pool and retries with backoff, and it never stomps on the feature branch you're in the middle of.
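As a rough picture of "concurrent with retries and backoff", the sketch below runs tasks on a thread pool and retries each one with exponentially growing delays. It is a generic pattern, not contextlake's code: `with_retries` and `run_all` are hypothetical names, the delays are arbitrary, and the pool size is fixed here rather than adaptive.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

def with_retries(task, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Run task(); on failure wait base_delay * 2**n plus jitter, then retry."""
    for attempt in range(attempts):
        try:
            return task()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))

def run_all(tasks, workers=8, **retry_options):
    """Run tasks concurrently; results come back in the order tasks were given."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda task: with_retries(task, **retry_options), tasks))

print(run_all([lambda: "repo-a updated", lambda: "repo-b updated"], workers=2))
```

The jitter keeps many workers that failed at the same moment from retrying in lockstep against the same server.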

Behind a slow / TLS-inspecting corporate proxy (e.g. Zscaler) where glab's API calls time out? Set GITLAB_TOKEN (a read_api token) and contextlake enumerates projects via its own HTTP client, which tolerates the slow DNS where glab's short dial timeout fails.
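For orientation, enumerating a group's projects over HTTP can be sketched against the GitLab REST API's group projects endpoint using only the standard library. This is not contextlake's client: `projects_url` and `list_projects` are hypothetical helpers, the host and the 60-second timeout are placeholder assumptions, and error handling is omitted.

```python
import json
import urllib.parse
import urllib.request

def projects_url(host, group, page=1, per_page=100):
    """Build the URL for one page of a group's projects, subgroups included."""
    group_id = urllib.parse.quote(group, safe="")  # "team/platform" -> "team%2Fplatform"
    query = urllib.parse.urlencode(
        {"include_subgroups": "true", "per_page": per_page, "page": page}
    )
    return f"https://{host}/api/v4/groups/{group_id}/projects?{query}"

def list_projects(host, group, token, timeout=60):
    """Yield project records page by page, following the X-Next-Page header."""
    page = "1"
    while page:
        request = urllib.request.Request(
            projects_url(host, group, int(page)),
            headers={"PRIVATE-TOKEN": token},  # a read_api token is enough to list
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            yield from json.load(response)
            page = response.headers.get("X-Next-Page", "")  # empty on the last page

print(projects_url("gitlab.example.com", "team/platform"))
```

Calling `list_projects("gitlab.example.com", "your-gitlab-group", os.environ["GITLAB_TOKEN"])` would then iterate over every project you can read in that group.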

Commands at a glance

Run any command as contextlake <command>. Full per-command docs: docs/usage.md.

Command	What it does
`status`	Show the workspace sync state vs GitLab (read-only)
`sync`	The full pipeline: fetch → clone → update → branches → verify → audit
`fetch` · `clone` · `update`	The sync steps, individually
`branches`	Switch each repo to its most active branch
`verify` · `audit`	Check the mirror vs GitLab; report repo health, age & drift (JSON + CSV)
`bootstrap`	Turnkey: sync + index + connect + embed + wiki + steer
`index`	Build the code/dependency graph (`--workspace`, incremental, `--watch`)
`connect`	Link repos to Atlassian / Figma / GitLab items
`embed`	Build semantic-search vectors (zero-config built-in CPU model, Ollama, or an API)
`wiki`	LLM-synthesized, council-verified wiki pages
`query`	Search the index (`--kind`, `--repo`, `--as-of <commit>`)
`graph`	Visualize the graph: offline interactive HTML / DOT / Mermaid / JSON
`serve`	Expose the graph over MCP (`--transport stdio`/`http`)
`steer`	Write editor steering files: `AGENTS.md`, `.mcp.json`, `.windsurfrules`, skills
`lint` · `doctor` · `eval`	Graph health · environment check · retrieval-quality scoring

Global options apply to any command: --dry-run (preview without changing anything), -v/-q (verbosity), --log-file PATH, --config PATH, --version. Output is colorized on a TTY and plain when piped; set NO_COLOR to force-disable.
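The colour rule described above (colour on a TTY, plain when piped, off when NO_COLOR is set) is a common CLI convention and can be sketched in a few lines. The `use_color` and `paint` helpers are hypothetical and only illustrate the decision; they are not taken from contextlake.

```python
import os
import sys

def use_color(stream=sys.stdout, environ=os.environ):
    """Colour only when writing to a terminal and NO_COLOR is not set."""
    if environ.get("NO_COLOR"):
        return False  # any non-empty NO_COLOR value disables colour
    return hasattr(stream, "isatty") and stream.isatty()

def paint(text, code, enabled):
    """Wrap text in an ANSI colour code when enabled, else return it unchanged."""
    return f"\033[{code}m{text}\033[0m" if enabled else text

print(paint("in sync", 32, use_color()))  # green on a terminal, plain in a pipe
```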

Knowledge layer

Beyond mirroring, the optional contextlake.kb layer turns your repos into a knowledge graph and serves it to AI tools over MCP. It can link repos to their Atlassian / Figma / GitLab items, add semantic search, write a curated wiki, visualize the graph (offline interactive HTML, fleet overview, a symbol's neighbourhood, or a single repo), and generate per-tool steering files + a skills library. Most of it needs no model; the rest works with a local Ollama or any OpenAI-compatible endpoint.
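To give a feel for what semantic search over embedding vectors involves, here is a toy ranking by cosine similarity. Everything in it is made up for illustration: the three-dimensional vectors, the symbol names in `INDEX`, and the `top_matches` helper. Real embeddings have hundreds of dimensions and come from a model, and contextlake's own retrieval may work differently.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_matches(query_vector, index, k=2):
    """Return the k entry names whose vectors are most similar to the query."""
    ranked = sorted(index.items(), key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical index: symbol name -> embedding vector.
INDEX = {
    "billing.charge": [0.9, 0.1, 0.0],
    "billing.refund": [0.8, 0.3, 0.1],
    "web.render_home": [0.0, 0.2, 0.9],
}

print(top_matches([1.0, 0.0, 0.0], INDEX))  # ['billing.charge', 'billing.refund']
```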

One command sets it all up:

contextlake bootstrap --kb-config ~/.contextlake/kb.toml

Full guide: docs/knowledge-layer.md.

Documentation

QUICKSTART.md: install → bootstrap → wire your editor, in minutes
docs/usage.md: every command, configuration, branch safety, scheduling
docs/knowledge-layer.md: the graph, connectors, search, wiki, steering
docs/internals.md: architecture & internals
docs/releasing.md: maintainer runbook (versioning, tagging, publishing)
CHANGELOG.md · ROADMAP.md · CONTRIBUTING.md · BRANDING.md

License

MIT; see LICENSE. Pebble the otter is the project mascot: deep context, clear answers.

Project details

Maintainers

sayak-sarkar

Release history

2.11.0 (Jun 28, 2026)
2.10.0 (Jun 27, 2026, this version)
2.9.1 (Jun 26, 2026)
2.9.0 (Jun 26, 2026)
2.8.0 (Jun 26, 2026)
2.7.0 (Jun 25, 2026)
2.6.0 (Jun 25, 2026)
2.5.1 (Jun 25, 2026)
2.5.0 (Jun 25, 2026)
2.4.0 (Jun 25, 2026)
2.3.0 (Jun 25, 2026)
2.2.0 (Jun 23, 2026)
2.1.6 (Jun 22, 2026)
2.1.5 (Jun 22, 2026)
2.1.4 (Jun 22, 2026)
2.1.3 (Jun 22, 2026)
2.1.2 (Jun 22, 2026)
2.1.1 (Jun 22, 2026)
2.1.0 (Jun 22, 2026)

Download files

Source Distribution

contextlake-2.10.0.tar.gz (286.0 kB view details)

Uploaded Jun 27, 2026 Source

Built Distribution

contextlake-2.10.0-py3-none-any.whl (288.9 kB view details)

Uploaded Jun 27, 2026 Python 3

File details

Details for the file contextlake-2.10.0.tar.gz.

File metadata

Download URL: contextlake-2.10.0.tar.gz
Upload date: Jun 27, 2026
Size: 286.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for contextlake-2.10.0.tar.gz
Algorithm	Hash digest
SHA256	`3f265d864f790b982ed0ce6e6f74b397358176391ac93737f28d18924334fab9`
MD5	`1f57854ec4675c1ce9fc3f85cbdbfb07`
BLAKE2b-256	`11a27efa6fdc2f7edfe83cabe9388c10871f153bf8c63035a7dc1f44b77a3461`

Provenance

The following attestation bundles were made for contextlake-2.10.0.tar.gz:

Publisher: release.yml on sayak-sarkar/contextlake

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: contextlake-2.10.0.tar.gz
- Subject digest: 3f265d864f790b982ed0ce6e6f74b397358176391ac93737f28d18924334fab9
- Sigstore transparency entry: 1986434336
- Sigstore integration time: Jun 27, 2026
Source repository:
- Permalink: sayak-sarkar/contextlake@2694465970ed3bbe854d2f8956ee3228eac6ef00
- Branch / Tag: refs/tags/v2.10.0
- Owner: https://github.com/sayak-sarkar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@2694465970ed3bbe854d2f8956ee3228eac6ef00
- Trigger Event: push

File details

Details for the file contextlake-2.10.0-py3-none-any.whl.

File metadata

Download URL: contextlake-2.10.0-py3-none-any.whl
Upload date: Jun 27, 2026
Size: 288.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for contextlake-2.10.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7169e4a2f73997a640f13ded83aa661cb7b75c547e3c3e37c8b24f8df2946342`
MD5	`1d65f701ea03a3abdccc959b5ab6fabf`
BLAKE2b-256	`c6809e4ee65b2176272572e5dbc022e45d7a2aa1e58c6d05b5773b1af660ae64`

Provenance

The following attestation bundles were made for contextlake-2.10.0-py3-none-any.whl:

Publisher: release.yml on sayak-sarkar/contextlake

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: contextlake-2.10.0-py3-none-any.whl
- Subject digest: 7169e4a2f73997a640f13ded83aa661cb7b75c547e3c3e37c8b24f8df2946342
- Sigstore transparency entry: 1986434549
- Sigstore integration time: Jun 27, 2026
Source repository:
- Permalink: sayak-sarkar/contextlake@2694465970ed3bbe854d2f8956ee3228eac6ef00
- Branch / Tag: refs/tags/v2.10.0
- Owner: https://github.com/sayak-sarkar
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@2694465970ed3bbe854d2f8956ee3228eac6ef00
- Trigger Event: push
