fleet-mem

Shared code intelligence for agent fleets. AST-aware semantic search, multi-agent memory, and git-concurrent coordination.

When multiple AI agents work on the same codebase, they fight. Agent A rewrites a function that Agent B is also modifying. Agent C searches for a pattern that Agent D already found and documented. Agents repeat work, create conflicts, and operate on stale information.

fleet-mem is a local MCP server that gives AI coding agents shared context:

  • Zero data leakage by default. Runs entirely on your machine using local Ollama embeddings. No cloud APIs, no telemetry, no data leaves your network. Cloud embedding providers (OpenAI, Gemini, etc.) are available as an opt-in choice.
  • Token-efficient code search. Understands the structure of your code via Abstract Syntax Trees (AST). Returns the specific function, not the entire file.
  • Shared memory across agents. Agent A discovers "this service uses JWT, not sessions." Agent B finds that knowledge automatically when working on the same code. Memories persist across sessions.
  • Fleet-aware coordination. Agents declare what files they are working on, get blocked on conflicts before they start, and get notified when another agent's merge affects their context.

Getting started

Prerequisites

  • Python 3.11+
  • Ollama running locally (brew, systemd, or Docker)
  • ollama pull nomic-embed-text

Install

git clone https://github.com/sam-ent/fleet-mem.git
cd fleet-mem
./scripts/setup.sh  # Creates venv, installs deps, registers MCP server

No manual venv activation is needed. The MCP client launches fleet-mem with the project's own virtual environment (.venv) automatically.


Docker (alternative)

./scripts/docker-setup.sh

MCP client configuration for Docker:

{
  "mcpServers": {
    "fleet-mem": {
      "command": "docker",
      "args": ["exec", "-i", "fleet-mem-fleet-mem-1", "python", "-m", "fleet_mem.server"]
    }
  }
}

Mount your code as a volume to index it:

# Add to docker-compose.yml under fleet-mem.volumes:
- /path/to/your/projects:/projects:ro

Index your codebases

./scripts/index-repos.sh --root ~/projects

MCP client configuration

Add to your MCP client settings (the setup.sh script does this automatically for the default client):

{
  "mcpServers": {
    "fleet-mem": {
      "command": "/path/to/fleet-mem/.venv/bin/python",
      "args": ["-m", "fleet_mem.server"],
      "cwd": "/path/to/fleet-mem",
      "env": {
        "OLLAMA_HOST": "http://localhost:11434",
        "ANONYMIZED_TELEMETRY": "False"
      }
    }
  }
}

fleet-mem works with any MCP-compatible client. Your client starts it automatically on the first tool call.
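
For clients that setup.sh does not configure, the entry above can be merged into an existing settings file by hand or with a few lines of Python. The following is a minimal sketch, not the logic setup.sh actually uses: `merge_server_entry` is a hypothetical helper, and it assumes the settings file is JSON with a top-level `mcpServers` key as in the example above.

```python
import json


def merge_server_entry(settings: dict, install_dir: str,
                       ollama_host: str = "http://localhost:11434") -> dict:
    """Return a copy of `settings` with a fleet-mem entry under mcpServers.

    Hypothetical helper: mirrors the JSON example in this README and leaves
    every other key in the settings untouched.
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers["fleet-mem"] = {
        "command": f"{install_dir}/.venv/bin/python",
        "args": ["-m", "fleet_mem.server"],
        "cwd": install_dir,
        "env": {
            "OLLAMA_HOST": ollama_host,
            "ANONYMIZED_TELEMETRY": "False",
        },
    }
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other-server": {"command": "other"}}}
    print(json.dumps(merge_server_entry(existing, "/path/to/fleet-mem"), indent=2))
```

Read the existing file with `json.load`, pass the result through the helper, and write it back with `json.dump`; other servers already registered in the file are preserved.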


Example agent queries

Once indexed, agents can ask questions that grep cannot answer:

  • "Find the authentication middleware and show me how tokens are validated"
  • "Which agent is currently working on the database schema?"
  • "What did other agents learn about the payment gateway this session?"
  • "If I merge this branch, which agents will have stale context?"

How it works

fleet-mem installs once as a global MCP server. It can index any number of projects. Each project gets its own collection in ChromaDB. All agents share the same server instance.

~/projects/
  project-a/     indexed as code_project-a
  project-b/     indexed as code_project-b
  project-c/     indexed as code_project-c

~/.local/share/fleet-mem/
  chroma/         vector embeddings (shared)
  memory.db       agent memories (shared)
  fleet.db        locks, subscriptions (shared)

Architecture

graph LR
    MCP[Any MCP Client] --> FM[fleet-mem]

    FM --> CS[Code Search]
    FM --> AM[Agent Memory]
    FM --> FC[Fleet Coord]

    CS --> C[(ChromaDB)]
    CS --> O[Ollama]
    AM --> C
    AM --> M[(memory.db)]
    AM --> O
    FC --> F[(fleet.db)]
    FC --> G[Git]

Components

| Component | What it is | Why we chose it |
| --- | --- | --- |
| Ollama | Local ML inference server | Runs embedding models on your machine at zero cost. Supports dozens of models. Works via Docker, systemd, or brew. Swappable via the Embedding base class |
| ChromaDB | Vector database (HNSW) | Purpose-built for similarity search over embeddings. Runs in-process, no separate server needed |
| SQLite + FTS5 | Relational database with full-text search | Agent memories need both keyword search and structured queries. FTS5 + ChromaDB vectors give hybrid ranking via reciprocal rank fusion |
| tree-sitter | Incremental parsing library | Splits code into semantic chunks (functions, classes, methods) instead of arbitrary character windows. Search results are meaningful code units, not fragments |
| xxHash (xxh3_64) | File change detection + chunk IDs | Detects which files changed between sync cycles. Not a security function, purely for diffing. ~10x faster than SHA-1 |

Language support

| Language | Splitting method | Support level |
| --- | --- | --- |
| Python, TypeScript, JavaScript | AST-aware | Tier 1: functions, classes, methods |
| Go, Rust | AST-aware | Tier 2: functions, types, impl blocks |
| All other languages | Text-only | Fallback: sliding window (2500 chars, 300 overlap) |

AST-aware splitting means search results are complete, meaningful code units. Text-only fallback still works but may return partial functions. Adding a new language requires defining its tree-sitter node types in src/splitter/ast_splitter.py (contributions welcome).
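
The text-only fallback can be pictured as a plain sliding window. This is a minimal sketch that only borrows the 2500-character window and 300-character overlap from the table above; `sliding_window_chunks` is a hypothetical name, not the function in src/splitter/.

```python
def sliding_window_chunks(text: str, size: int = 2500, overlap: int = 300) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(6000))
    print([len(c) for c in sliding_window_chunks(sample)])
```

Because the windows ignore syntax, a function that straddles a window boundary is split across two chunks, which is why this path may return partial functions.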


Process flows


Indexing a codebase

Problem: Agents read entire files to understand code, burning tokens and missing context across files.

Solution: One-time indexing parses code into semantic chunks and embeds them. Agents search by meaning across the whole codebase.

sequenceDiagram
    participant S as Setup / Sync
    participant FM as fleet-mem
    participant TS as tree-sitter
    participant OL as Ollama
    participant C as ChromaDB

    S->>FM: index_codebase(path)
    FM->>FM: Walk files, skip .gitignore
    FM->>TS: Parse into AST
    TS-->>FM: Chunks (functions, classes)
    FM->>OL: Embed (batches of 64)
    OL-->>FM: Vectors
    FM->>C: Upsert chunks + vectors
    FM-->>S: {status: indexed}
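
To make the "parse into chunks, then embed in batches" steps concrete, here is a rough sketch. fleet-mem uses tree-sitter for parsing; this example swaps in Python's standard-library `ast` module so it runs without extra dependencies, and it handles Python source only. The names `split_python_source` and `batches` are hypothetical.

```python
import ast


def split_python_source(source: str) -> list[dict]:
    """Return one chunk per function, class, and method, with line ranges."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "text": "\n".join(lines[node.lineno - 1:node.end_lineno]),
            })
    return sorted(chunks, key=lambda c: c["start_line"])


def batches(items: list, size: int = 64):
    """Yield consecutive slices of at most `size` items (64 matches the diagram)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    code = "class A:\n    def m(self):\n        return 1\n\ndef f():\n    return 2\n"
    for chunk in split_python_source(code):
        print(chunk["kind"], chunk["name"], chunk["start_line"], chunk["end_line"])
```

Each chunk carries its own line range, which is what lets a search result point at a specific function instead of a whole file.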

Semantic code search

Problem: Grep requires exact strings. Agents don't know file names or function signatures in unfamiliar code.

Solution: Natural language query returns ranked code snippets with file paths and line numbers. No exact match needed.

sequenceDiagram
    participant A as Agent
    participant FM as fleet-mem
    participant OL as Ollama
    participant C as ChromaDB

    A->>FM: search_code("auth middleware")
    FM->>OL: Embed query
    OL-->>FM: Query vector
    FM->>C: Nearest-neighbor search
    C-->>FM: Top-K chunks + distances
    FM-->>A: [{file, lines, snippet, score}]
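
The nearest-neighbor step can be illustrated with an exact, brute-force version. ChromaDB uses an approximate HNSW index, so this sketch only shows the ranking idea. The cosine metric, the toy two-dimensional vectors, and the names `cosine_distance` and `top_k` are assumptions for illustration.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def top_k(query_vec: list[float], chunks: list[dict], k: int = 3) -> list[dict]:
    """Rank stored chunks by distance to the query vector and keep the best k."""
    scored = [(cosine_distance(query_vec, c["vector"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0])
    return [
        {"file": c["file"], "lines": c["lines"], "score": round(1.0 - d, 4)}
        for d, c in scored[:k]
    ]


if __name__ == "__main__":
    index = [
        {"file": "auth/middleware.py", "lines": (10, 42), "vector": [1.0, 0.0]},
        {"file": "billing/invoice.py", "lines": (5, 30), "vector": [0.0, 1.0]},
        {"file": "auth/tokens.py", "lines": (1, 20), "vector": [1.0, 1.0]},
    ]
    print(top_k([1.0, 0.0], index, k=2))
```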

Storing and searching memory

Problem: Agents lose everything they learn when a session ends. The next agent re-discovers the same things from scratch.

Solution: Discoveries persist in a shared memory store. Any agent can find them later via keyword or semantic search.

sequenceDiagram
    participant A as Agent
    participant FM as fleet-mem
    participant M as memory.db
    participant OL as Ollama
    participant C as ChromaDB
    participant F as fleet.db

    A->>FM: memory_store("auth uses JWT")
    FM->>M: INSERT memory + FTS index
    FM->>OL: Embed content
    FM->>C: Upsert vector
    FM->>F: Notify matching subscribers

    A->>FM: memory_search("authentication")
    FM->>M: FTS5 keyword search
    FM->>OL: Embed query
    FM->>C: Vector search
    FM-->>A: Merged ranked results
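
The "merged ranked results" step combines the FTS5 keyword ranking with the vector ranking. The Components table names reciprocal rank fusion for this; below is a minimal sketch of that technique. The constant k=60 is a common default in the literature and is an assumption here, not a value taken from fleet-mem.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


if __name__ == "__main__":
    keyword_hits = ["m1", "m2", "m3"]  # e.g. order returned by keyword search
    vector_hits = ["m3", "m1", "m4"]   # e.g. order returned by vector search
    print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Only ranks are used, so the keyword scores and the vector distances never need to be put on a common scale.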

File locking

Problem: Concurrent agents modify the same files, causing merge conflicts and wasted work.

Solution: Agents declare their work area before starting. Conflicts are caught immediately, not after hours of wasted effort.

sequenceDiagram
    participant A as Agent A
    participant B as Agent B
    participant FM as fleet-mem
    participant F as fleet.db

    A->>FM: lock_acquire(["src/auth/*"])
    FM->>F: INSERT lock
    FM-->>A: acquired

    B->>FM: lock_acquire(["src/auth/login.py"])
    FM->>F: Check overlap (fnmatch)
    FM-->>B: conflict (holder: A)

    A->>FM: lock_release()
    FM->>F: DELETE lock

    B->>FM: lock_acquire(["src/auth/login.py"])
    FM-->>B: acquired
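
The overlap check in the diagram relies on fnmatch. Here is an in-memory sketch of the acquire/conflict/release cycle. The `LockRegistry` class and the "either pattern matches the other" heuristic are assumptions; the real server stores locks in fleet.db.

```python
from fnmatch import fnmatch


def patterns_overlap(a: str, b: str) -> bool:
    """Heuristic: two patterns overlap if either one matches the other as a path."""
    return fnmatch(a, b) or fnmatch(b, a)


class LockRegistry:
    """Hypothetical in-memory stand-in for the lock table in fleet.db."""

    def __init__(self) -> None:
        self._locks: dict[str, list[str]] = {}  # agent_id -> held patterns

    def acquire(self, agent_id: str, patterns: list[str]) -> dict:
        for holder, held in self._locks.items():
            if holder == agent_id:
                continue  # an agent never conflicts with itself
            if any(patterns_overlap(p, h) for p in patterns for h in held):
                return {"status": "conflict", "holder": holder}
        self._locks[agent_id] = list(patterns)
        return {"status": "acquired"}

    def release(self, agent_id: str) -> None:
        self._locks.pop(agent_id, None)


if __name__ == "__main__":
    registry = LockRegistry()
    print(registry.acquire("A", ["src/auth/*"]))
    print(registry.acquire("B", ["src/auth/login.py"]))
    registry.release("A")
    print(registry.acquire("B", ["src/auth/login.py"]))
```

This heuristic catches a glob against a concrete path, as in the diagram, but it can miss two globs that overlap without either matching the other literally (for example `src/*/login.py` and `src/auth/*`).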

Cross-agent knowledge sharing

Problem: Agent A discovers something important about the code. Agent B, working in the same area, has no way to know.

Solution: Agents subscribe to file patterns they care about. When another agent stores a discovery matching that pattern, subscribers are notified automatically.

sequenceDiagram
    participant A as Agent A
    participant B as Agent B
    participant FM as fleet-mem
    participant F as fleet.db
    participant M as memory.db
    participant C as ChromaDB

    B->>FM: memory_subscribe(["src/auth/*"])
    FM->>F: INSERT subscription

    A->>FM: memory_store("auth uses JWT")
    FM->>M: INSERT node
    FM->>C: Embed + store
    FM->>F: Match subscriptions, notify B

    B->>FM: memory_notifications()
    FM->>F: SELECT unread
    FM-->>B: ["auth uses JWT"]
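
The subscribe, store, and notify round trip can be sketched with an in-memory SQLite database. The table layout, the column names, and the explicit `file_path` argument are all hypothetical; the source only states that subscriptions and notifications live in fleet.db.

```python
import sqlite3
from fnmatch import fnmatch


def open_fleet_db() -> sqlite3.Connection:
    """Create a throwaway database with a hypothetical two-table schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE subscriptions (agent_id TEXT, pattern TEXT)")
    db.execute(
        "CREATE TABLE notifications (agent_id TEXT, content TEXT, is_read INTEGER DEFAULT 0)"
    )
    return db


def subscribe(db: sqlite3.Connection, agent_id: str, patterns: list[str]) -> None:
    db.executemany(
        "INSERT INTO subscriptions VALUES (?, ?)", [(agent_id, p) for p in patterns]
    )


def store_memory(db: sqlite3.Connection, author: str, content: str, file_path: str) -> None:
    """Notify every other agent whose subscription pattern matches file_path."""
    rows = db.execute("SELECT agent_id, pattern FROM subscriptions").fetchall()
    targets = {a for a, pattern in rows if a != author and fnmatch(file_path, pattern)}
    db.executemany(
        "INSERT INTO notifications (agent_id, content) VALUES (?, ?)",
        [(a, content) for a in sorted(targets)],
    )


def notifications(db: sqlite3.Connection, agent_id: str) -> list[str]:
    """Return unread notifications for an agent and mark them as read."""
    rows = db.execute(
        "SELECT rowid, content FROM notifications "
        "WHERE agent_id = ? AND is_read = 0 ORDER BY rowid",
        (agent_id,),
    ).fetchall()
    db.executemany(
        "UPDATE notifications SET is_read = 1 WHERE rowid = ?", [(r[0],) for r in rows]
    )
    return [content for _, content in rows]


if __name__ == "__main__":
    db = open_fleet_db()
    subscribe(db, "B", ["src/auth/*"])
    store_memory(db, "A", "auth uses JWT", "src/auth/login.py")
    print(notifications(db, "B"))
```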

Merge impact preview

Problem: Agent A merges a PR. Agents B and C are still working on branches that now have stale context. No one tells them.

Solution: Before merging, see exactly which agents, memories, and branches will be affected. After merging, one call notifies everyone and marks stale context.

sequenceDiagram
    participant AC as Agent / CI
    participant FM as fleet-mem
    participant F as fleet.db
    participant M as memory.db
    participant C as ChromaDB

    AC->>FM: merge_impact(["src/auth/login.py"])
    FM->>F: Query overlapping locks
    FM->>F: Query matching subscriptions
    FM->>C: Check stale branch overlays
    FM->>M: Query stale file anchors
    FM-->>AC: {locked, subscribed, stale}

    AC->>FM: notify_merge(branch, files)
    FM->>F: Create notifications
    FM->>M: Mark anchors stale

Embedding providers

The default is Ollama (local, free). fleet-mem also ships an OpenAI-compatible adapter that works with any provider offering an OpenAI-style embeddings API.

| Provider | Setup | Cost |
| --- | --- | --- |
| Ollama (default) | Install Ollama, ollama pull nomic-embed-text | Free |
| OpenAI | Set EMBEDDING_PROVIDER=openai-compat, EMBED_API_KEY, EMBED_MODEL=text-embedding-3-small | ~$0.02/1M tokens |
| DeepSeek | Set EMBED_BASE_URL=https://api.deepseek.com/v1, EMBED_API_KEY, EMBED_MODEL=deepseek-embed | ~$0.01/1M tokens |
| Gemini | Set EMBED_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai/, EMBED_API_KEY, EMBED_MODEL=text-embedding-004 | Free tier available |
| Together | Set EMBED_BASE_URL=https://api.together.xyz/v1, EMBED_API_KEY, model of choice | Varies |
| Local vLLM | Set EMBED_BASE_URL=http://localhost:8000/v1, no API key needed | Free |

See .env.example for full configuration. For providers without an OpenAI-compatible API (Cohere, AWS Bedrock, Hugging Face), see docs/custom-embedding-providers.md. The adapter interface is four methods and typically under 30 lines.
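
For orientation, an OpenAI-style embeddings call is a POST to `<base URL>/embeddings` with a JSON body containing `model` and `input`. The sketch below builds such a request from the environment variables in the table without sending it. `build_embedding_request` is a hypothetical helper, not part of fleet-mem's adapter, and the fallback base URL is an assumption.

```python
import json
import os
import urllib.request


def build_embedding_request(texts, env=os.environ) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style embeddings request."""
    # Assumed fallback: OpenAI's public base URL when EMBED_BASE_URL is unset.
    base_url = env.get("EMBED_BASE_URL", "https://api.openai.com/v1").rstrip("/")
    headers = {"Content-Type": "application/json"}
    api_key = env.get("EMBED_API_KEY")
    if api_key:  # local servers such as vLLM may not need a key
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"model": env["EMBED_MODEL"], "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings", data=body, headers=headers, method="POST"
    )


if __name__ == "__main__":
    request = build_embedding_request(
        ["def login(user): ..."],
        env={"EMBED_BASE_URL": "http://localhost:8000/v1", "EMBED_MODEL": "my-model"},
    )
    print(request.full_url)
```

Sending the request with `urllib.request.urlopen` returns JSON in which each vector sits under `data[i]["embedding"]`, in the same order as the inputs.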


Features

Code understanding

  • Semantic search: "find auth middleware" returns relevant functions, not string matches
  • Symbol lookup: find function/class definitions across indexed projects
  • Dependency analysis: trace what calls or imports a given symbol
  • Incremental sync: xxHash Merkle tree detects file changes, re-indexes only deltas
  • Branch-aware indexing: overlay collections for feature branches keep changes isolated from the main index

Fleet coordination

  • File lock registry: agents declare which files they are working on, others check before starting
  • Cross-agent memory: agents share discoveries via subscriptions and notifications
  • Merge impact preview: before merging, see which in-flight agents would be affected
  • Post-merge notification: after merging, automatically notify affected agents and mark stale context

Configuration

All settings via environment variables or a .env file in the project root. Copy .env.example to get started.

| Variable | Default | Description |
| --- | --- | --- |
| OLLAMA_HOST | http://localhost:11434 | Ollama API endpoint |
| OLLAMA_EMBED_MODEL | nomic-embed-text | Embedding model name |
| EMBEDDING_PROVIDER | ollama | Provider: ollama or openai-compat |
| CHROMA_PATH | ~/.local/share/fleet-mem/chroma | ChromaDB storage |
| MEMORY_DB_PATH | ~/.local/share/fleet-mem/memory.db | Agent memory database |
| FLEET_DB_PATH | ~/.local/share/fleet-mem/fleet.db | Fleet coordination database |
| SYNC_INTERVAL | 300 | Background code index sync (seconds) |
| MCP_SETTINGS_FILE | ~/.claude/settings.json | MCP client settings path |

Background sync timing

| What | Timing | How |
| --- | --- | --- |
| Code index refresh | Every SYNC_INTERVAL seconds (default: 300) | Polls filesystem, computes xxHash digests, re-indexes changed files |
| Agent memory writes | Immediate | Direct SQLite + ChromaDB insert on memory_store call |
| Lock acquire/release | Immediate | Direct SQLite write |
| Notifications | Immediate | Created on memory_store if subscriptions match |

For fast-moving multi-agent work, reduce SYNC_INTERVAL to 30-60 seconds. File watching is also available for near-instant sync: set WATCH_ENABLED=true to detect changes immediately without polling.
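
The polling refresh boils down to hashing every file and comparing the result with the previous snapshot. In this sketch, `hashlib.blake2b` stands in for xxHash (xxh3_64) so the example needs no third-party package, and it uses a flat path-to-digest map, not a Merkle tree. The function names are hypothetical, and .gitignore handling is omitted.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under root to a short content digest."""
    return {
        str(p.relative_to(root)): hashlib.blake2b(p.read_bytes(), digest_size=8).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths as added, removed, or modified between two snapshots."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


if __name__ == "__main__":
    before = {"a.py": "11", "b.py": "22"}
    after = {"a.py": "11", "b.py": "33", "c.py": "44"}
    print(diff_snapshots(before, after))
```

Only the paths under "added" and "modified" need to be re-parsed and re-embedded; chunks for "removed" paths can be dropped from the index.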


Scripts

| Script | Purpose |
| --- | --- |
| scripts/setup.sh | One-time install: venv, dependencies, Ollama check, MCP registration |
| scripts/index-repos.sh | Find git repos under a root directory and index each one |
| scripts/import-flat-files.py | Import existing memory files (markdown with YAML frontmatter) |
| scripts/embed-existing-nodes.py | Embed existing memory DB nodes into ChromaDB for semantic search |

Observability

fleet-mem includes OpenTelemetry tracing (zero new dependencies, uses the OTel SDK already bundled with ChromaDB). Disabled by default.

Quick start

# Enable tracing and point to your collector
OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317  # Jaeger, Grafana Tempo, etc.

What is traced

| Span | Key attributes |
| --- | --- |
| fleet.index | project, chunk_count |
| fleet.search | query_hash (never raw query), result_count |
| fleet.memory.store | content_hash, node_type, agent_id |
| fleet.memory.search | query_hash, result_count |

All content is hashed in span attributes for privacy. Raw code and queries never appear in traces.
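
One way to keep raw text out of traces is to hash it before it becomes a span attribute. The sketch below assumes SHA-256 truncated to 16 hex characters; the source does not say which hash or length fleet-mem uses, and the helper names are hypothetical.

```python
import hashlib


def hashed(value: str, length: int = 16) -> str:
    """Stable, non-reversible stand-in for a raw string (assumed: SHA-256 prefix)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:length]


def search_span_attributes(query: str, results: list) -> dict:
    """Build the attributes listed for fleet.search without the raw query."""
    return {"query_hash": hashed(query), "result_count": len(results)}


if __name__ == "__main__":
    print(search_span_attributes("auth middleware", ["chunk-1", "chunk-2"]))
```

Identical queries still produce identical hashes, so repeated searches can be correlated in a trace viewer without exposing their text.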

Fleet stats (no collector needed)

The fleet_stats MCP tool returns current metrics without requiring an external collector:

fleet_stats() -> {
  collections: {code_myproject: 1523},
  total_chunks: 1523,
  memory_nodes: 47,
  active_locks: 2,
  subscriptions: 5,
  pending_notifications: 1,
  cached_embeddings: 892
}

MCP tools reference

Code search (6 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| index_codebase | path, branch?, force? | Index a codebase (background). Branch-aware when branch specified |
| search_code | query, path?, branch?, limit? | Semantic code search across indexed projects |
| find_symbol | name, file_path?, symbol_type? | Find symbol definitions (functions, classes) |
| find_similar_code | code_snippet, limit? | Find code similar to a given snippet |
| get_change_impact | file_paths?, symbol_names? | Find code affected by changes to given files/symbols |
| get_dependents | symbol_name, depth? | Trace what calls/imports a symbol (BFS) |
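
The BFS behind get_dependents can be pictured as a depth-limited walk over a reverse call graph. This is an illustrative sketch: the `callers` mapping and the `(symbol, level)` return shape are assumptions, not the tool's actual output format.

```python
from collections import deque


def get_dependents(symbol: str, callers: dict[str, list[str]], depth: int = 2) -> list[tuple[str, int]]:
    """Breadth-first walk over a reverse call graph.

    `callers` maps a symbol to the symbols that call or import it.
    Returns (dependent, level) pairs in the order they are discovered.
    """
    seen = {symbol}
    found = []
    queue = deque([(symbol, 0)])
    while queue:
        current, level = queue.popleft()
        if level == depth:
            continue  # do not expand beyond the requested depth
        for caller in callers.get(current, []):
            if caller not in seen:
                seen.add(caller)
                found.append((caller, level + 1))
                queue.append((caller, level + 1))
    return found


if __name__ == "__main__":
    graph = {
        "validate_token": ["auth_middleware", "refresh"],
        "auth_middleware": ["create_app"],
        "create_app": ["main"],
    }
    print(get_dependents("validate_token", graph, depth=2))
```

The `seen` set keeps the walk from looping on mutually recursive symbols.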

Agent memory (4 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| memory_store | node_type, content, agent_id? | Store a memory with optional file anchor |
| memory_search | query, top_k?, node_type? | Hybrid keyword + semantic memory search |
| memory_promote | memory_id, target_scope? | Promote a project memory to global scope |
| stale_check | project_path? | Find memories whose anchored files have changed |

Fleet coordination (8 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| lock_acquire | agent_id, project, file_patterns | Declare files an agent is working on |
| lock_release | agent_id, project | Release file locks |
| lock_query | project, file_path? | Check who holds locks on which files |
| merge_impact | project, files | Preview which agents/memories are affected by a merge |
| notify_merge | project, branch, merged_files | Post-merge: notify affected agents, mark stale anchors |
| memory_feed | agent_id?, since_minutes? | Recent memories from other agents |
| memory_subscribe | agent_id, file_patterns | Subscribe to memories about specific files |
| memory_notifications | agent_id | Check for new relevant memories from other agents |

Status and observability (5 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| get_index_status | path | Check indexing status for a project |
| clear_index | path | Drop a project's index and reset |
| get_branches | path | List indexed branches with chunk counts |
| cleanup_branch | path, branch | Drop a branch overlay after merge |
| fleet_stats | (none) | Current metrics: chunks, memories, locks, cache hits, notifications |

What's next

  • Go/Rust recursive AST splitting (promote to Tier 1)
  • Performance benchmarks on real codebases
  • MCP client configuration guides for Cursor, Windsurf
  • Agent workflow templates for common multi-agent patterns
  • Web dashboard for fleet status visualization

See roadmap.md for the full plan.


License

MIT

Acknowledgments

Architecture inspired by claude-context by Zilliz (MIT License). Design patterns informed by their TypeScript reference (vector database abstraction, embedding adapter, Merkle DAG, AST splitter). All code is an original Python implementation with significant additions (agent memory, fleet coordination, hybrid search, staleness detection).
