Local MCP context bus for coding LLMs (Claude Code, Cursor, Codex, Gemini CLI, Antigravity, Copilot, VS Code, opencode).

Wevex

Local MCP context bus for coding LLMs.
One daemon, every coding client connected to the same typed context.

Python 3.9+ License: Apache 2.0


The problem

Every LLM tool ships its own memory silo. Switching tools means starting from zero. Two agents on the same project can't see each other's work. Today the workaround is copy-pasting between Claude Code, Cursor, Windsurf, Hermes, and a dozen other tools.

The solution

One local daemon. Every coding client connects via MCP (or AGENTS.md for non-MCP tools). They share typed fragments (decisions, state, observations, requirements) per scope (project / team / org), coordinate via advisory leases, and all get the same rendered AGENTS.md.

  Claude Code  Cursor  Windsurf  Kiro  Codex  Hermes  Goose  Crush  gptme
       │           │        │      │      │       │       │      │      │
       └──────────── MCP Streamable HTTP (127.0.0.1:8765/mcp) ─────────┘
  Continue.dev  Antigravity  VS Code / Copilot  opencode  (+ more via wevex connect)
       │               │              │              │
       └───────────────┴──────────────┴──────────────┘
                                    │
                       ┌────────────┴────────────┐
                       │   FastAPI + MCP server  │  one process, one port
                       └────────────┬────────────┘
                                    │
                          SQLite + FTS5 + numpy
                    (fragments · commits · leases · scopes)
                                    │
              ┌──────────────────┬──┴──────────────────┐
              │   CLI (wevex)    │                      │
              │  init/serve/sync │    AGENTS.md (file)  │
              │  remember/recall │    Pi · OpenClaw     │
              │  lease/doctor    │    (non-MCP fallback) │
              └──────────────────┴─────────────────────┘

Quick start

Two commands. That's it.

# 1. Install (once)
pip install wevex

# 2. Activate in any project
cd ~/Documents/your-project
wevex up

Other install paths that work the same:

pipx install wevex          # recommended for CLI tools — isolated env, auto-PATH
uv tool install wevex       # modern, fastest
pip3 install wevex          # macOS users where `pip` points at Python 2
py -m pip install wevex     # Windows

After wevex up, every detected client automatically has shared context for the project. The daemon runs as a background service that survives terminal close and reboots on all three OSes — launchd agent on macOS, systemd-user unit on Linux, Scheduled Task (logon trigger, restart-on-failure) on Windows.

Supported clients

wevex connect auto-detects and wires up any of these:

Client             MCP config written                      Notes
Claude Code        claude mcp add wevex                    Via CLI; bearer token in header
Cursor             .cursor/mcp.json
Windsurf           .windsurf/mcp.json                      Uses serverUrl key
Kiro               .kiro/settings/mcp.json                 AWS spec-first IDE (GA May 2026)
VS Code / Copilot  .vscode/mcp.json                        One entry covers both
Codex CLI          .codex/config.toml                      TOML [[mcpServers]] block
Antigravity        ~/.gemini/antigravity/mcp_config.json   Google's Gemini CLI replacement
Hermes             ~/.hermes/config.yaml + ~/.hermes/.env  Nous Research; token in env var
Goose              ~/.config/goose/config.yaml             Block; streamable_http transport
Crush              .crush.json                             Charm; explicit "type": "http"
gptme              ~/.config/gptme/config.toml             TOML [[mcp.servers]] block
Continue.dev       ~/.continue/mcpServers/wevex.yaml       Dedicated block file
opencode           ~/.config/opencode/config.json
Gemini CLI         ~/.gemini/settings.json                 Sunset June 18 2026; use Antigravity

Pi.dev: no native MCP support (deliberate design choice). A community adapter exists at pi-mcp-adapter but is unmaintained. Wevex does not auto-detect Pi.

Windows notes

wevex up on Windows registers a Scheduled Task named Wevex\Daemon at the current user's logon — no admin elevation needed. The XML definition pins RestartOnFailure to the same 3-retry/1-minute interval the launchd KeepAlive and systemd Restart=always paths use, so reboot persistence and crash recovery match the POSIX backends. State lives under %APPDATA%\wevex\ (the same set of files macOS/Linux keep in ~/.config/wevex/). To see or remove the task by hand:

schtasks /Query  /TN "Wevex\Daemon" /V
schtasks /Delete /TN "Wevex\Daemon" /F   # equivalent to `wevex down`

If schtasks.exe is missing (Server Core, nano-server, stripped containers) the daemon falls back to a CREATE_NEW_PROCESS_GROUP | DETACHED_PROCESS nohup-style spawn — survives terminal close but not reboot. Run wevex up again on first login in that case.

wevex up is idempotent — safe to run repeatedly. It does:

1. Init: Generate bearer token + config (~/.config/wevex/config.json) on first run
2. Daemon: Install + start a background service (auto-relocates the venv to ~/.wevex/venv if needed for macOS TCC)
3. Scope: Auto-detect from git remote (git@github.com:user/repo.git → project:repo) or cwd name
4. Hooks: Drop .claude/settings.json + .cursor/rules/wevex.mdc + .wevex/scope so every LLM auto-recalls and auto-remembers
5. Sync: Write MCP configs for every detected client (Claude Code, Cursor, Windsurf, Kiro, Codex, Hermes, Goose, Crush, gptme, Continue.dev, Antigravity, VS Code / Copilot, opencode) + AGENTS.md + CLAUDE.md
6. Ingest: Index the codebase for semantic search (incremental, free re-runs)
7. Watcher: Spawn a session-scoped subprocess that re-ingests changed files within ~2 seconds. Survives terminal close; dies on logout (re-spawned by next wevex up)
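
The scope step (3) can be pictured with a small sketch. This is not Wevex's actual code: the function names scope_from_remote, scope_from_cwd, and detect_scope are hypothetical, and the only behaviour taken from the text above is that a git remote ending in user/repo.git maps to project:repo, with the directory name as the fallback.

```python
import re
from pathlib import PurePosixPath
from typing import Optional


def scope_from_remote(remote_url: str) -> str:
    """Map a git remote URL to a 'project:<repo>' handle (illustrative only)."""
    # The last segment after ':' or '/' is the repository name, for both
    # SSH (git@host:user/repo.git) and HTTPS (https://host/user/repo.git).
    tail = re.split(r"[:/]", remote_url.rstrip("/"))[-1]
    repo = tail[:-4] if tail.endswith(".git") else tail
    return f"project:{repo}"


def scope_from_cwd(path: str) -> str:
    """Fallback: use the directory name when there is no remote."""
    return f"project:{PurePosixPath(path).name}"


def detect_scope(remote_url: Optional[str], cwd: str) -> str:
    """Prefer the git remote; fall back to the working-directory name."""
    return scope_from_remote(remote_url) if remote_url else scope_from_cwd(cwd)


print(detect_scope("git@github.com:user/repo.git", "/home/me/checkout"))
print(detect_scope(None, "/home/me/your-project"))
```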

To turn it off:

wevex down       # stop the daemon and remove hooks from this project
wevex restart    # restart the daemon
wevex status     # see what's running (daemon + clients + counts)
wevex doctor     # deeper diagnostic (logs, value distribution, inbox depth)

After wevex up — day-to-day usage

Wevex is meant to be invisible. Most work happens through MCP tools your LLM already calls — recall, remember, note_decision, search_code, boost, bury, archaeology, supersede — so the user surface is small on purpose. The five CLI commands you'll actually reach for:

# See what's wired up
wevex status

# Deep diagnostic (folds in the old `clients`, `events`, `chunks stats`,
# `projects list`, `daemon status`, value-distribution, inbox depth)
wevex doctor

# What's the project state?
wevex briefing
wevex briefing --since 2d        # what changed in the last 2 days

# Live event stream
wevex tail

# Interactive control panel
wevex tui

wevex doctor has two action flags that absorb what used to be standalone commands:

wevex doctor --clean       # interactive cleanup (replaces `wevex gc`)
wevex doctor --reingest    # re-index the cwd (replaces `wevex ingest`)

The daemon takes care of everything else automatically — AGENTS.md regen, TTL gc, docs sync, inbox auto-approve. You don't run a sync command; the file just stays current.


Fragment types

Type          Default TTL  When to use
preference    90 days      "prefer async/await over callbacks"
fact          30 days      "API rate limit is 1000 req/min"
decision      30 days      "use Redis for caching layer"
state         7 days       "API schema v3 is current"
observation   14 days      "auth middleware has a race condition"
requirement   permanent    "users must be able to export their data"
procedure     permanent    runbooks, how-tos
conversation  30 days      extracted from chat threads
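
To make the TTL column concrete, here is a minimal sketch of how an expiry time could be derived from a fragment's type. The day counts come from the table above; the names DEFAULT_TTL_DAYS, expires_at, and is_stale are hypothetical and are not part of Wevex's API.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Default TTLs in days, copied from the table; None means "permanent".
DEFAULT_TTL_DAYS = {
    "preference": 90,
    "fact": 30,
    "decision": 30,
    "state": 7,
    "observation": 14,
    "requirement": None,
    "procedure": None,
    "conversation": 30,
}


def expires_at(fragment_type: str, created: datetime) -> Optional[datetime]:
    """Return the expiry timestamp, or None for permanent fragment types."""
    days = DEFAULT_TTL_DAYS[fragment_type]
    return None if days is None else created + timedelta(days=days)


def is_stale(fragment_type: str, created: datetime, now: datetime) -> bool:
    """A fragment is stale once 'now' reaches its expiry; permanent types never are."""
    deadline = expires_at(fragment_type, created)
    return deadline is not None and now >= deadline


created = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(expires_at("state", created))        # one week after creation
print(expires_at("requirement", created))  # None: never expires
```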

CLI reference

The visible surface is intentionally small — Wevex is an invisible helper, not a CLI suite to learn. Ten commands cover everything a human needs:

wevex up [path]    Start the daemon, register cwd, wire up detected clients
wevex down         Stop daemon + watcher, uninstall hooks
wevex restart      Restart the daemon
wevex status       One-screen health: daemon, clients, fragment + chunk counts
wevex doctor       Deep diagnostic. Flags: --clean / --reingest / --perf
wevex tail         Live event stream
wevex briefing     Project state. With --since <when>, becomes the diff feed
wevex tui          Interactive control panel
wevex config       View or set runtime configuration
wevex connect      Wire installed LLM tools through Wevex. --remove disconnects

What the daemon does automatically (no command, no maintenance):

  • Regenerates AGENTS.md for each registered project when fragments change.
  • Drains the extraction-candidate inbox: auto-approves above the confidence threshold, auto-rejects items older than 14 days that didn't clear.
  • Sweeps expired TTLs (lease + fragment stale-mark).
  • Tails docs (README/CHANGELOG/ADRs) into fragments via the docs watcher.

Need a fine-grained command from a previous version? Most still work but are hidden from --help and will be removed in a follow-up — prefer the MCP tool path (recall, remember, note_decision, boost, bury, archaeology, supersede) for anything an LLM session needs to do.


MCP tools (for Claude Code, Cursor, Codex, etc.)

Once wevex up runs and your client is connected, every MCP-capable LLM gets these tools:

Tool Description
project_briefing(scope?) One-call project dashboard (300 tokens, <50ms)
recall(query, scope, types?, limit?) Search for relevant context fragments
recall_one(fragment_id) Full content of a specific fragment
remember(content, type, scope, territory?, tags?) Store context
note_decision(content, scope, alternatives?, rationale?) Record a decision with structure
supersede(old_id, new_content, reason?, type?, tags?) Retire a fragment + create its replacement atomically
boost(fragment_id, value?) Pin a fragment to high recall-value when the user says "this is important"
bury(fragment_id) Floor a fragment's value when the user says "this is wrong" — kept in audit, hidden from recall
archaeology(query, scope?, limit?) Reconstruct a decision's provenance (session, commit, supersede chain)
search_code(query, scope, languages?, source_root?, limit?) Hybrid search over the ingested codebase
claim_lease(glob, scope, ttl_seconds?) Advisory lock on file-glob
release_lease(lease_id) Release a lease
query_leases(scope?) List active leases
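
For readers curious what a tool invocation looks like on the wire, the sketch below builds the JSON-RPC 2.0 body that an MCP client sends for a tools/call request. The tool and argument names are taken from the table above; the helper tool_call is hypothetical, and a real client also performs the MCP initialize handshake and sends the bearer token, which this sketch leaves out.

```python
import json


def tool_call(request_id: int, name: str, **arguments) -> str:
    """Serialize an MCP 'tools/call' request as a JSON-RPC 2.0 message."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(payload)


# Example: ask for up to five decision fragments about auth in one scope.
body = tool_call(
    1,
    "recall",
    query="auth decision",
    scope="project:myapp",
    types=["decision"],
    limit=5,
)
print(body)
```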

MCP resources (auto-injected):

Resource URI Content
context://{scope}/state Current state fragments
context://{scope}/decisions All active decisions
context://{scope}/agents-md Rendered AGENTS.md
context://{scope}/recent-commits Last 20 commits

MCP prompt:

Prompt Description
session_start Auto-inject AGENTS.md + 5 relevant fragments at session start

Autonomous mode

wevex up (which also runs on wevex connect) turns Wevex from "tools the LLM can call" into "context that flows automatically without anyone asking." It writes a small set of files into the current project:

File Purpose
.wevex/scope Pins which scope hooks should use for this project
.claude/settings.json Registers SessionStart, UserPromptSubmit, Stop, PostToolUse hooks
.cursor/rules/wevex.mdc Auto-applied Cursor rule that tells Composer to call recall/remember proactively

What each hook does:

Hook Fires on Effect
SessionStart Claude Code session opens Injects the most-relevant requirement / decision / state fragments into the session as context
UserPromptSubmit Every user prompt to Claude Code Recalls fragments scoring above a threshold for the prompt and injects them
Stop Claude finishes a turn Scans the assistant message for decision-shaped sentences (e.g. "I decided to use FastAPI") and persists them as decision fragments tagged auto-extracted
PostToolUse After Edit / Write / MultiEdit Records the file path as an observation fragment with the territory (e.g. backend/auth)
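
As an illustration of what "decision-shaped sentences" might mean for the Stop hook, here is a deliberately naive sketch. The regular expression and the function extract_decisions are assumptions made for this example; the real extractor is not described in this README and is likely more careful.

```python
import re
from typing import List

# Hypothetical pattern: first-person subject, a decision verb, then the
# rest of the sentence up to the next terminator.
DECISION_RE = re.compile(
    r"\b(?:I|we) (?:decided|chose|opted) to [^.!?]*[.!?]?",
    re.IGNORECASE,
)


def extract_decisions(message: str) -> List[str]:
    """Return every sentence fragment that looks like a stated decision."""
    return [m.group(0).strip() for m in DECISION_RE.finditer(message)]


text = "Looking at the options. I decided to use FastAPI. Tests pass."
for sentence in extract_decisions(text):
    print(sentence)
```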

The hooks talk to the SQLite DB directly, so they:

  • Run with sub-100ms latency on the warm path (no HTTP round trip)
  • Continue working when the daemon isn't running
  • Use whatever embedding provider is configured (auto-skip vector if none works)

The cold path (the first hook firing after a long idle) is closer to 1 s because of Python interpreter startup and module imports; warm-up scripts in ~/.wevex/cache/ mitigate this on macOS, but it's not free.

Multi-tool scenario: Open Claude Code in a hook-installed project and ask it to make a decision. Close it. Open Cursor in the same project — it reads AGENTS.md (rendered from the same fragments) and reads its rule file (which tells it to recall proactively). Cursor sees the decision Claude just made.

# Run once per project — installs hooks for every detected client:
cd ~/Documents/your-project
wevex up

# What's wired up?
wevex status

# Disconnect a client (removes the hooks):
wevex connect cursor --remove
wevex connect --all --remove   # disconnect everything

wevex up writes the hooks into .claude/settings.json automatically (the per-project .wevex/scope file pins which scope to use).


Codebase RAG

Fragments are great for typed context — decisions, requirements, observations. They're not great for "find the function that does X across this 50k-LOC codebase." For that, Wevex has a separate chunks layer.

Ingest a codebase

wevex up runs the initial ingest and registers a file-watcher that keeps the chunk index current automatically. The visible knobs:

# First-time setup: indexes the cwd and starts the watcher.
wevex up

# Force a full re-index (replaces the old `wevex ingest .`):
wevex doctor --reingest

# Re-embed every fragment under the active embedding provider
# (e.g. after switching from gemini to fastembed):
wevex doctor --reindex-embeddings

What gets ingested:

  • Languages auto-detected by extension: Python, JS, TS, Go, Rust, Java, Kotlin, Swift, Ruby, PHP, C, C++, C#, Scala, Clojure, Elixir, Haskell, Lua, Shell, SQL, Markdown, YAML, JSON, TOML, HTML, CSS, Dockerfile, Terraform, Protobuf, GraphQL, Vue, Svelte, Dart, R, …
  • Excluded by default: .git, node_modules, __pycache__, venv, .venv, dist, build, target, .next, .cache, _archive_v2, …
  • Skipped: files larger than 512 KB (override with --max-bytes), binary files, anything that fails UTF-8 decode

Each file is split into overlapping line windows (default 80 lines, 10-line overlap). Each chunk is hashed with SHA-256 — re-ingesting a file whose contents haven't changed is free (no DB write, no re-embedding API call).
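
A minimal sketch of the windowing and hashing described above, assuming the defaults of 80 lines with a 10-line overlap. The names line_windows and chunk_hash are hypothetical; only the window sizes and the use of SHA-256 come from the text.

```python
import hashlib
from typing import Iterator, Tuple


def line_windows(
    text: str, size: int = 80, overlap: int = 10
) -> Iterator[Tuple[int, int, str]]:
    """Yield (first_line, last_line, content) for overlapping line windows.

    Line numbers are 1-based and inclusive.
    """
    lines = text.splitlines()
    step = size - overlap
    start = 0
    while start < len(lines):
        end = min(start + size, len(lines))
        yield start + 1, end, "\n".join(lines[start:end])
        if end == len(lines):
            break
        start += step


def chunk_hash(content: str) -> str:
    """SHA-256 of the chunk text; an unchanged hash means nothing to re-embed."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


source = "\n".join(f"line {i}" for i in range(1, 201))
for first, last, content in line_windows(source):
    print(first, last, chunk_hash(content)[:12])
```

Comparing the stored hash with the freshly computed one is what lets an unchanged file be skipped without a DB write or an embedding call.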

Search the ingested code

The agent is the canonical caller — the MCP search_code tool is what Claude Code, Cursor, Codex, etc. invoke directly. Humans see the same ranking through the REST endpoint:

# MCP — Claude Code, Cursor, etc. call this directly
search_code(query="how does auth work", scope="project:myapp")

# REST (same hybrid pipeline)
curl -X POST http://127.0.0.1:8765/v1/chunks/search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query":"auth bearer","scope":"project:myapp","limit":5}'

Results return file path + line range + the matched chunk content, ranked by hybrid BM25 + vector + RRF.
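
Reciprocal Rank Fusion (RRF) is the step that merges the BM25 ranking with the vector ranking. The sketch below shows the standard formula, where each list contributes 1 / (k + rank) per item. The constant k = 60 is the value commonly used in the literature and is an assumption here; the README does not say which value Wevex uses, and rrf_fuse is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked ID lists into one list, best first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # keyword ranking
vector_hits = ["b", "c", "a"]  # embedding ranking
print(rrf_fuse([bm25_hits, vector_hits]))
```

An item that ranks well in both lists ("b" here) beats one that tops a single list, which is the point of fusing instead of picking one ranker.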

Stats / inspection

Per-scope chunk counts, language breakdown, and root summaries appear in wevex doctor (alongside fragment counts, value distribution, and inbox depth). One-off cleanup is wevex doctor --clean.

Why chunks separate from fragments?

Layer Use case Typical count What you put there
Fragments Typed atomic context (the bus) 100s–10ks Decisions, requirements, preferences, observations
Chunks Codebase / document RAG 1k–1M Code files, docs, READMEs, ADRs

Both share the same scope hierarchy and hybrid-retrieval pipeline, but they're indexed separately so a "what was the auth decision?" query (recall) doesn't compete with "show me code that touches auth" (search).

Scaling notes

The default vector search streams chunk embeddings from SQLite in batches of 5,000 and computes cosine similarity with NumPy. On a modern laptop, measured numbers from wevex doctor --perf on a 566-chunk repo: code search p50 ≈ 14 ms (worst-case 27 ms). The brute-force cosine is O(N) and starts to feel slow past ~50k chunks — for larger codebases, swap in sqlite-vec or usearch; both can drop in behind Storage.chunks_vector_search without schema changes.

Verify against your own DB with wevex doctor --perf.
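
The O(N) behaviour is easy to see in a sketch. The version below streams rows in batches and keeps a running top-k, which mirrors the shape of the approach described above. It is written in pure Python so it runs without dependencies, whereas the text says Wevex computes the similarities with NumPy; the names batches, cosine, and top_k are hypothetical.

```python
import heapq
import math
from typing import Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[str, Sequence[float]]  # (chunk_id, embedding)


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Group an iterable of rows into lists of at most 'size' items."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(
    query: Sequence[float], rows: Iterable[Row], k: int = 5, batch_size: int = 5000
) -> List[Tuple[float, str]]:
    """Scan every row once, keeping only the k best (score, chunk_id) pairs."""
    best: List[Tuple[float, str]] = []
    for batch in batches(rows, batch_size):
        scored = [(cosine(query, vec), chunk_id) for chunk_id, vec in batch]
        best = heapq.nlargest(k, best + scored)
    return best


rows = [("x", [1.0, 0.0]), ("y", [0.0, 1.0]), ("z", [1.0, 1.0])]
print(top_k([1.0, 0.0], rows, k=2, batch_size=2))
```

Every query touches every embedding once, so latency grows linearly with the chunk count; an approximate index trades that scan for a pre-built structure.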


REST API

GET  /health                    Public health check + stats
POST /v1/scopes                 Create a scope
GET  /v1/scopes                 List scopes
GET  /v1/scopes/{handle}/lineage Scope hierarchy
POST /v1/fragments              Create a fragment (auto-embeds, auto-commits)
GET  /v1/fragments              List fragments (filter by scope/type)
GET  /v1/fragments/search       Keyword+semantic search (GET)
POST /v1/fragments/recall       Hybrid search (POST with full RecallRequest)
GET  /v1/fragments/{id}         Get fragment
PATCH /v1/fragments/{id}        Update with OCC (send expected_version)
DELETE /v1/fragments/{id}       Soft-delete
GET  /v1/commits                Commit log
GET  /v1/commits/{id}           Single commit
POST /v1/leases                 Acquire advisory lease
GET  /v1/leases                 List leases
DELETE /v1/leases/{id}          Release lease
POST /v1/chunks/search          Codebase RAG hybrid search
GET  /v1/chunks/search          Same, GET form
GET  /v1/chunks                 List indexed chunks
GET  /v1/chunks/stats           Per-scope chunk counts by language/root
DELETE /v1/chunks/{root}        Delete every chunk under a source_root
POST /mcp                       MCP JSON-RPC endpoint

Interactive docs at http://127.0.0.1:8765/docs.
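
The PATCH route mentions OCC (optimistic concurrency control) with an expected_version field. Here is an in-memory sketch of that idea, assuming a simple integer version that increments on each successful write. FragmentStore and VersionConflict are hypothetical names; the real server's storage and error responses are not described in this README.

```python
from typing import Dict


class VersionConflict(Exception):
    """Raised when the caller's expected_version is out of date."""


class FragmentStore:
    """Toy store that rejects writes based on a stale version number."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def create(self, fragment_id: str, content: str) -> int:
        self._rows[fragment_id] = {"content": content, "version": 1}
        return 1

    def update(self, fragment_id: str, content: str, expected_version: int) -> int:
        row = self._rows[fragment_id]
        if row["version"] != expected_version:
            # Someone else wrote first: the caller must re-read and retry.
            raise VersionConflict(
                f"expected v{expected_version}, found v{row['version']}"
            )
        row["content"] = content
        row["version"] += 1
        return row["version"]


store = FragmentStore()
store.create("f1", "use Redis for caching layer")
print(store.update("f1", "use Valkey for caching layer", expected_version=1))
try:
    store.update("f1", "use memcached", expected_version=1)  # stale version
except VersionConflict as exc:
    print("conflict:", exc)
```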


Architecture decisions

Decision          Chosen                          Alternatives         Reason
Storage           SQLite + FTS5 + numpy           Postgres + pgvector  Zero external deps; portability
Retrieval         Hybrid BM25 + vector + RRF      Vector-only          +20% recall@10
MCP transport     Streamable HTTP (hand-rolled)   mcp[cli] SDK         Python 3.9 compat; minimal deps
Auth              Local bearer token              OAuth 2.1            Sufficient for local-first v1
Embeddings        hash (offline, deterministic)   Gemini, OpenAI       Works with zero API keys; swap via config
Instruction file  AGENTS.md (canonical)           CLAUDE.md per-tool   ~60k repos; every major client reads it
Coordination      Advisory leases + OCC           CRDTs                Right abstraction for semantic conflicts

See 20 Projects/Company Brain/Wevex - Pivot ADR Log.md for full ADR history.
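
The "advisory leases" in the coordination row can be sketched as a table of glob claims with expiry times. Advisory means nothing is blocked: a tool asks who holds a path and decides for itself. Everything in this example (Lease, LeaseTable, matching with fnmatchcase, passing the clock in as a number) is an assumption made for illustration, not Wevex's implementation.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List


@dataclass
class Lease:
    lease_id: int
    holder: str
    glob: str
    expires_at: float  # seconds on whatever clock the caller uses


class LeaseTable:
    """Toy advisory lease registry keyed by file glob."""

    def __init__(self) -> None:
        self._leases: List[Lease] = []
        self._next_id = 1

    def claim(self, holder: str, glob: str, ttl_seconds: float, now: float) -> Lease:
        lease = Lease(self._next_id, holder, glob, now + ttl_seconds)
        self._next_id += 1
        self._leases.append(lease)
        return lease

    def release(self, lease_id: int) -> None:
        self._leases = [l for l in self._leases if l.lease_id != lease_id]

    def holders_of(self, path: str, now: float) -> List[str]:
        """Who currently claims this path? Expired leases are ignored."""
        return [
            l.holder
            for l in self._leases
            if l.expires_at > now and fnmatchcase(path, l.glob)
        ]


table = LeaseTable()
table.claim("claude-code", "backend/auth/*", ttl_seconds=60, now=0)
print(table.holders_of("backend/auth/token.py", now=10))
print(table.holders_of("backend/auth/token.py", now=61))
```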


Configuration

Config file: ~/.config/wevex/config.json (created by wevex init).

Key                 Default                   Env override
port                8765                      WEVEX_PORT
host                127.0.0.1                 WEVEX_HOST
db_path             ~/.config/wevex/wevex.db  WEVEX_DB_PATH
embedding_provider  fastembed                 WEVEX_EMBEDDING_PROVIDER
bearer_token        (generated)               WEVEX_BEARER_TOKEN
default_scope       project:default           WEVEX_DEFAULT_SCOPE

Embedding providers: fastembed (local BAAI/bge-small-en-v1.5, 384-dim, default — no API key, ~130 MB one-time model download), openai (cloud, requires OPENAI_API_KEY), bm25 (FTS5-only, no vector ranking), hash (tests-only).
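
The table implies a layered lookup: built-in default, then the config file, then the environment variable. The sketch below assumes exactly that precedence (environment wins), which the README suggests by calling the variables overrides but does not spell out. resolve_config is a hypothetical helper; the keys, defaults, and variable names are copied from the table.

```python
from typing import Any, Dict, Mapping

DEFAULTS: Dict[str, Any] = {
    "port": 8765,
    "host": "127.0.0.1",
    "db_path": "~/.config/wevex/wevex.db",
    "embedding_provider": "fastembed",
    "default_scope": "project:default",
}

ENV_NAMES = {
    "port": "WEVEX_PORT",
    "host": "WEVEX_HOST",
    "db_path": "WEVEX_DB_PATH",
    "embedding_provider": "WEVEX_EMBEDDING_PROVIDER",
    "bearer_token": "WEVEX_BEARER_TOKEN",
    "default_scope": "WEVEX_DEFAULT_SCOPE",
}


def resolve_config(
    file_values: Mapping[str, Any], env: Mapping[str, str]
) -> Dict[str, Any]:
    """Merge defaults, file values, and environment overrides, in that order."""
    config = dict(DEFAULTS)
    config.update(file_values)
    for key, env_name in ENV_NAMES.items():
        if env_name in env:
            raw = env[env_name]
            # Environment values are strings; the port must be an integer.
            config[key] = int(raw) if key == "port" else raw
    return config


print(resolve_config({"port": 9000}, {"WEVEX_PORT": "9100"})["port"])
```

In real use the second argument would be os.environ and the first would be the parsed contents of config.json.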


Running tests

python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[test]"
pytest tests/ -v

444 tests (as of iter 15) covering storage, retrieval, REST API, MCP JSON-RPC, AGENTS.md renderer, sync (including Antigravity), the autonomous hook system, the codebase RAG layer (ingest + chunks search), the cross-platform daemon manager (launchd / systemd-user / nohup with TCC auto-relocate), scope auto-detection from git remotes, the .wevex/scope pin honoring across all CLI commands, the active-projects registry, the watcher's incremental dispatch, the code scanner, the git-commit decision watcher, the bm25/gemini provider auto-detection, the .git-required guard on wevex up, and wevex doctor --perf self-measurement.


Scopes

Scopes form a visibility hierarchy: public ⊃ org ⊃ team ⊃ project ⊃ personal.
A recall query on project:foo returns fragments from that project, its team, org, and public.

Scopes are inferred automatically: wevex up picks one from .wevex/scope, the git remote, or the directory name. Parent scopes (org:, team:) are created by the daemon the first time a fragment references them; you never run a scope create command.
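
The visibility rule can be sketched as walking up a parent chain. The parent mapping below is invented for the example (team:platform and org:acme are placeholder handles), and visible_scopes is a hypothetical helper; only the rule that a project sees its team, org, and public comes from the text above.

```python
from typing import Dict, List


def visible_scopes(scope: str, parents: Dict[str, str]) -> List[str]:
    """Return the scope plus every ancestor, nearest first."""
    chain = [scope]
    while chain[-1] in parents:
        parent = parents[chain[-1]]
        if parent in chain:  # guard against a malformed, cyclic mapping
            break
        chain.append(parent)
    return chain


# Placeholder hierarchy for illustration only.
PARENTS = {
    "project:foo": "team:platform",
    "team:platform": "org:acme",
    "org:acme": "public",
}

print(visible_scopes("project:foo", PARENTS))
```

A recall on project:foo would then search fragments whose scope is any member of that list.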


Embedding quality

The default fastembed provider runs BAAI/bge-small-en-v1.5 locally (384-dim, ONNX). First daemon startup downloads ~130 MB of model weights to ~/.cache/fastembed/; after that every embedding is a sub-15 ms CPU call with no API key and no rate limits. Recall is hybrid: BM25 (FTS5) fused with vector cosine via Reciprocal Rank Fusion.

If you want OpenAI's text-embedding-3-small instead:

export OPENAI_API_KEY=your-key
pip install 'wevex[openai]'
wevex config set embedding_provider openai

If you can't or don't want to install fastembed, fall back to keyword-only:

wevex config set embedding_provider bm25

The previous gemini embedding provider was removed in iter 27 — its rate limits wedged the daemon's event loop. Existing configs naming gemini are silently aliased to fastembed on next load. Gemini CLI (the terminal agent) is being sunset by Google on June 18, 2026; Antigravity is the replacement and is already a fully-supported sync target. Wevex continues to detect and connect gemini_cli for users still on it.


License

MIT — see LICENSE.
