Give your AI coding agent persistent memory — stop re-reading files every conversation. MCP server for Claude Code, Claude Desktop, Cursor, and other AI coding tools.

Stele Context

Persistent memory for AI coding agents. A zero-dependency MCP server that lets Claude Code, Claude Desktop, Cursor, and any MCP-compatible AI coding assistant remember your codebase between conversations — instead of re-reading every file from scratch.

License: MIT | Python 3.10+ | Zero Dependencies

The Problem: AI Agents Re-Read Everything

Every new conversation with Claude Code, Cursor, or any other LLM coding tool starts from zero. The agent re-reads the same files it read yesterday, burning thousands of tokens on code that hasn't changed. On a medium-sized project that's real money and real context-window space spent re-learning what the agent already knew.

Stele Context is a local context cache that fixes this: index once, then only pay for what actually changed.

What It Does

  1. Indexes your project files once — code, docs, configs, even images and PDFs — into a local SQLite database
  2. Detects file changes with an mtime+size fast path and SHA-256 verification — unchanged files cost zero re-reads
  3. Returns a diff instead of the whole file when something did change — a 1-line edit in a 600-line file comes back as a ~60-token unified diff, not a 10,000-token re-read
  4. Searches your codebase by meaning, keyword, or exact pattern — semantic code search, BM25, and token-budgeted grep in one tool
  5. Maps how your code connects — a symbol graph for find-references, go-to-definition, and "what breaks if I change this file?" impact analysis

Everything runs 100% offline on your machine. No internet, no API keys, no cloud, no telemetry. Just Python's standard library and SQLite.
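
The change-detection step in the list above can be pictured with a small stdlib-only sketch. This is an illustration of the general "mtime+size fast path, SHA-256 to verify" idea, not Stele Context's actual code; `CacheEntry`, `snapshot`, and `is_unchanged` are hypothetical names.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    """What a cache might remember about a file (hypothetical layout)."""
    mtime_ns: int
    size: int
    sha256: str


def file_sha256(path: str) -> str:
    """Hash the file in fixed-size blocks so large files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def snapshot(path: str) -> CacheEntry:
    """Record the metadata and content hash at indexing time."""
    stat = os.stat(path)
    return CacheEntry(stat.st_mtime_ns, stat.st_size, file_sha256(path))


def is_unchanged(path: str, entry: CacheEntry) -> bool:
    """Fast path on mtime+size; fall back to hashing when metadata differs."""
    stat = os.stat(path)
    if stat.st_mtime_ns == entry.mtime_ns and stat.st_size == entry.size:
        return True  # fast path: no read needed
    if stat.st_size != entry.size:
        return False  # a different size means different content
    # Same size but a new mtime (for example after `touch`): verify by hash.
    return file_sha256(path) == entry.sha256
```

The design point is that the common case (nothing changed) costs one `os.stat` call, and the hash is only computed when the cheap check is inconclusive.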

Quick Start

Install

pip install stele-context

Index your project

stele-context index src/ docs/ README.md

This chunks your files and stores them in a .stele-context/ folder in your project root. Indexing respects your .gitignore out of the box, so node_modules/, build output, and secrets stay out of the index.

Search your code

stele-context search "how does authentication work"
stele-context search-text "TODO" --regex
stele-context agent-grep "createApp" --group-by file

Connect it to Claude Code, Claude Desktop, or Cursor

Stele Context runs as an MCP server — the standard protocol for giving AI agents extra tools.

pip install stele-context[mcp]

Claude Code — add to ~/.claude/settings.json:

{
  "mcpServers": {
    "stele-context": {
      "command": "stele-context",
      "args": ["serve-mcp"]
    }
  }
}

Claude Desktop: add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS), ~/.config/Claude/claude_desktop_config.json (Linux), or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "stele-context": {
      "command": "stele-context",
      "args": ["serve-mcp"]
    }
  }
}

Cursor, Windsurf, and other MCP clients — point them at the same stele-context serve-mcp command; the configuration shape is the same.

Tip: If you installed in a virtualenv, use the full path: run which stele-context to find it.
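
If you would rather generate the config entry than edit it by hand, a minimal sketch along these lines resolves the absolute path for you. `build_mcp_config` is a hypothetical helper, not part of Stele Context; it only reproduces the JSON shape shown above using `shutil.which` from the standard library.

```python
import json
import shutil


def build_mcp_config(command_path=None):
    """Return the mcpServers entry, preferring an absolute path to the binary."""
    # shutil.which returns None when the command is not on PATH,
    # in which case we fall back to the bare command name.
    resolved = command_path or shutil.which("stele-context") or "stele-context"
    return {
        "mcpServers": {
            "stele-context": {
                "command": resolved,
                "args": ["serve-mcp"],
            }
        }
    }


if __name__ == "__main__":
    # Print the snippet so it can be pasted into the client's config file.
    print(json.dumps(build_mcp_config(), indent=2))
```

Run it inside the virtualenv where stele-context is installed so that the resolved path points at that environment's binary.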

Once connected, your agent gets ~32 tools for caching, searching, and navigating your code — it uses them automatically when they're helpful. (Set STELE_MCP_MODE=lite for ~15 essential tools, or STELE_MCP_MODE=full for the complete surface.)

Who Is This For?

  • You use Claude Code, Claude Desktop, Cursor, or another AI pair-programming tool
  • You're tired of watching your agent re-read the same files at the start of every session
  • You want to cut token costs and stop wasting context window on unchanged code
  • You want local code search that understands your project — not a cloud RAG pipeline
  • You care about supply chain security and want tooling you can audit end to end

If you've ever wished your AI coding assistant had a memory of your project, that's exactly what this is.

What Can It Do?

For everyday use

| What you want | How Stele helps |
| --- | --- |
| "Don't re-read files that haven't changed" | get_context returns cached content for unchanged files; change detection uses an mtime+size fast path with SHA-256 verification |
| "Show me only what changed in a file I already read" | Changed files come back with diff_since_cache: a hash-exact, token-bounded unified diff against the cached version, with a read_diff/reread_file cost recommendation |
| "Don't index my build output and vendored deps" | Directory indexing respects .gitignore by default, anchored at your git project root, or at the indexed folder itself when there's no repo (respect_gitignore = false to opt out) |
| "Ask a broad question about my code" | query combines semantic search, the symbol graph, and text grep into one deduplicated result list |
| "What files would break if I change this?" | impact_radius follows the dependency chain to find affected files; significance_threshold filters noise from common symbols. Works with symbol= for dynamic/runtime hooks and direction= for outgoing or bidirectional traversal |
| "Which files are tightly coupled?" | coupling shows shared symbols with a semantic_score that discounts generic boilerplate; mode=co_consumers catches files imported together |
| "Find where this function is defined and used" | find_references / find_definition walk the symbol graph with a clear verdict: referenced, unreferenced, external, or not found |
| "Find every line matching a pattern" | agent_grep does text/regex search with scope annotation, classification, and a token budget |
| "Run several operations in one round-trip" | batch executes multiple tool calls under a single write lock |

For power users

  • Multi-agent safe — multiple AI agents share one index without stepping on each other (document locking, optimistic versioning, conflict detection)
  • Git worktree aware — each worktree gets its own index, with shared coordination across all of them
  • Session management — save and restore agent state between conversations (rollback, pruning, read-history)
  • Self-maintaining — change history and telemetry auto-prune to configurable bounds; doctor reports index health and database growth
  • Many file types — code (12 languages built in), text, Markdown, images, PDFs, audio, video (media types need optional packages)
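
To make "optimistic versioning" concrete, here is a minimal sketch of the general technique on top of SQLite. It is an assumption-laden illustration, not the project's schema: the `docs` table and the `open_store`, `read_doc`, and `write_doc` helpers are hypothetical.

```python
import sqlite3


def open_store():
    """Create an in-memory store with a version column per document."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docs ("
        "path TEXT PRIMARY KEY, body TEXT, version INTEGER NOT NULL)"
    )
    return conn


def read_doc(conn, path):
    """Return (body, version), or None if the document is unknown."""
    return conn.execute(
        "SELECT body, version FROM docs WHERE path = ?", (path,)
    ).fetchone()


def write_doc(conn, path, body, expected_version):
    """Write only if nobody else bumped the version since we read it."""
    cursor = conn.execute(
        "UPDATE docs SET body = ?, version = version + 1 "
        "WHERE path = ? AND version = ?",
        (body, path, expected_version),
    )
    conn.commit()
    # rowcount is 0 when the version no longer matches: a detected conflict.
    return cursor.rowcount == 1
```

Two agents that read the same version can both attempt a write, but only the first succeeds; the second sees `False` and has to re-read before retrying, which is the conflict-detection behavior the bullet describes in general terms.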

How Many Tokens Does It Save?

| What changed | Tokens without Stele | Tokens with Stele | Savings |
| --- | --- | --- | --- |
| Nothing (same code) | 10,000 | 0 | 100% |
| A typo fix | 10,000 | ~60 (the diff) | 99% |
| Edited a few functions | 10,000 | ~1,000 (the diff) | 90% |
| Rewrote the whole file | 10,000 | 10,000 | 0% |

The diff numbers aren't aspirational: diff_since_cache reconstructs the cached version from stored chunks, verifies it against the file's stored hash, and returns a unified diff the agent reads instead of the file. When the diff would cost more than re-reading (tiny files, total rewrites), it says so explicitly.
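
The "diff or re-read" decision described above can be sketched with `difflib` from the standard library. This is not the diff_since_cache implementation: `diff_or_reread` is a hypothetical function, and the four-characters-per-token estimate is a crude assumption used only to compare the two costs.

```python
import difflib


def estimate_tokens(text: str) -> int:
    """Crude assumption: about four characters per token."""
    return max(1, len(text) // 4)


def diff_or_reread(cached: str, current: str, path: str = "file"):
    """Build a unified diff and recommend the cheaper way to catch up."""
    diff = "".join(
        difflib.unified_diff(
            cached.splitlines(keepends=True),
            current.splitlines(keepends=True),
            fromfile=f"{path} (cached)",
            tofile=f"{path} (current)",
        )
    )
    diff_tokens = estimate_tokens(diff) if diff else 0
    full_tokens = estimate_tokens(current)
    # Recommend the diff only when it is strictly cheaper than the file.
    recommendation = "read_diff" if diff_tokens < full_tokens else "reread_file"
    return {
        "diff": diff,
        "diff_tokens": diff_tokens,
        "full_tokens": full_tokens,
        "recommendation": recommendation,
    }
```

With a one-line edit in a 600-line file the diff is a few context lines and wins easily; with a total rewrite the diff contains every old and every new line, so re-reading is cheaper.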

Zero Dependencies, By Design

pip install stele-context installs exactly one package: this one. The core runs on Python's standard library alone — the vector index (HNSW), BM25 ranking, Porter stemmer, TOML parser, diffing, and storage are all stdlib or hand-rolled and auditable in-repo.

This is a security posture, not a packaging quirk. Every dependency in an AI-agent toolchain is a supply chain attack surface — a tool your agent runs on every conversation should not pull in a dependency tree you can't read. There is no pickle anywhere (JSON-only serialization), no network code in the cache path, and ~17,000 lines of Python you can audit yourself.
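
To give a sense of how small a hand-rolled, stdlib-only ranking function can be, here is a minimal BM25 sketch. It is a textbook formulation written for illustration, not the code in the repository; the tokenizer, the function names, and the `k1`/`b` defaults are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on anything that is not a letter, digit, or underscore."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score every document against the query with the classic BM25 formula."""
    tokenized = [tokenize(doc) for doc in documents]
    doc_count = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / doc_count if doc_count else 0.0

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (doc_count - n + 0.5) / (n + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = 1 - b + b * (len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + k1 * norm)
        scores.append(score)
    return scores
```

Nothing here needs a third-party package, which is the point of the section: a reviewer can read the whole ranking path in one sitting.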

Optional extras add capabilities only if you opt in:

pip install stele-context[tree-sitter]   # Better parsing for 9 more languages
pip install stele-context[image,pdf]     # Image and PDF support
pip install stele-context[performance]   # Faster math with numpy
pip install stele-context[all]           # Everything

Full extras list

| Extra | What it adds |
| --- | --- |
| performance | Faster math for search (numpy, msgspec) |
| tree-sitter | Better code parsing for JS/TS, Java, C/C++, Go, Rust, Ruby, PHP |
| image | Index and search images (Pillow) |
| pdf | Extract text from PDFs (pymupdf) |
| audio | Index audio files (librosa) |
| video | Index video keyframes (opencv) |
| mcp | MCP server for Claude Desktop/Code |
| all | All of the above |

Python API

You can also use Stele Context directly as a Python library:

from stele_context import Stele

engine = Stele()

# Index your project
result = engine.index_documents(["src/", "README.md"])
print(f"Indexed {result['total_chunks']} chunks")

# Cached read: unchanged files come back from cache,
# changed files come back with a diff
ctx = engine.get_context(["src/main.py"])
for entry in ctx["changed"]:
    print(entry["diff_since_cache"]["diff"])

# Search by meaning or keywords
results = engine.search("authentication logic", top_k=5)
for r in results:
    print(f"{r['document_path']}: {r['content'][:100]}...")

# Check what changed since last time
changes = engine.detect_changes_and_update("my-session")
print(f"{len(changes['modified'])} files changed, {len(changes['new'])} new files")

# Find where a function/class is used
refs = engine.find_references("MyClassName")
print(f"Verdict: {refs['verdict']}")  # referenced, unreferenced, external, or not_found

# What breaks if I change this file?
impact = engine.impact_radius(document_path="src/main.py")
print(f"{impact['affected_files']} files could be affected")
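
Conceptually, a "what breaks if I change this file?" query is a walk over reverse dependencies. The sketch below shows that idea as a breadth-first search over a plain import map. It is not how impact_radius is implemented; `reverse_dependency_walk` and the sample graph are hypothetical.

```python
from collections import deque


def reverse_dependency_walk(imports, changed, max_depth=None):
    """Return {file: distance} for files that transitively depend on `changed`.

    `imports` maps each file to the files it imports.
    """
    # Invert "file -> files it imports" into "file -> files that import it".
    dependents = {}
    for source, targets in imports.items():
        for target in targets:
            dependents.setdefault(target, set()).add(source)

    affected = {}
    queue = deque([(changed, 0)])
    while queue:
        current, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue
        for dependent in sorted(dependents.get(current, ())):
            if dependent != changed and dependent not in affected:
                affected[dependent] = depth + 1
                queue.append((dependent, depth + 1))
    return affected


if __name__ == "__main__":
    graph = {
        "cli.py": ["app.py"],
        "app.py": ["auth.py"],
        "auth.py": ["db.py"],
        "util.py": [],
    }
    print(reverse_dependency_walk(graph, "db.py"))
```

The distance value gives a rough sense of how direct the dependency is; a depth limit keeps the result readable on large graphs.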

Configuration

Create a .stele-context.toml in your project root to customize behavior:

[stele-context]
chunk_size = 512                # How big each chunk is (in tokens)
skip_dirs = [".git", "node_modules", "dist", "vendor"]

All settings are optional — defaults work well for most projects.

All configuration options

[stele-context]
storage_dir = ".stele-context"   # Where to store the index
chunk_size = 256                 # Target tokens per chunk
max_chunk_size = 4096            # Maximum tokens per chunk
merge_threshold = 0.7            # When to merge similar adjacent chunks
change_threshold = 0.85          # When to consider a chunk "unchanged"
search_alpha = 0.42              # Balance between meaning-based and keyword search
skip_dirs = [".git", "node_modules", "__pycache__"]
respect_gitignore = true         # Skip .gitignore'd files when indexing directories
max_history_entries = 1000       # Auto-prune change history past this bound (0 = unlimited)
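
As an illustration of what a setting like search_alpha typically controls, the sketch below blends normalized semantic and keyword scores with a single weight. The direction of the weighting (alpha applied to the semantic side) and the min-max normalization are assumptions made for this example; the source does not specify either, and `blend_scores` is a hypothetical name.

```python
def min_max(scores):
    """Rescale scores to the 0..1 range; all-equal input maps to zeros."""
    if not scores:
        return []
    low, high = min(scores), max(scores)
    if high == low:
        return [0.0 for _ in scores]
    return [(s - low) / (high - low) for s in scores]


def blend_scores(semantic, keyword, alpha=0.42):
    """Weighted blend: alpha on semantic scores, (1 - alpha) on keyword scores.

    Assumption: which side alpha weights is a guess made for illustration.
    """
    sem = min_max(semantic)
    kw = min_max(keyword)
    return [alpha * s + (1 - alpha) * k for s, k in zip(sem, kw)]
```

Normalizing first matters because cosine similarities and BM25 scores live on different scales; without it, one signal would dominate regardless of alpha.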

You can also set STELE_CONTEXT_STORAGE_DIR as an environment variable, or pass options directly in Python:

engine = Stele(chunk_size=512, skip_dirs=[".git", "node_modules", "dist"])

Priority: Python arguments > .stele-context.toml > environment variables > defaults.
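
The precedence order above can be modeled with `collections.ChainMap`, where the first mapping that contains a key wins. This is a sketch of the stated ordering, not the library's loader; `resolve_settings` is hypothetical, and the two default values are copied from the options listing on the assumption that it shows defaults.

```python
from collections import ChainMap

# Assumed defaults, taken from the configuration listing above.
DEFAULTS = {"storage_dir": ".stele-context", "chunk_size": 256}


def resolve_settings(python_args=None, toml_values=None, env_values=None):
    """Merge settings so that earlier layers override later ones.

    Order: Python arguments > .stele-context.toml > environment > defaults.
    """
    layers = [python_args or {}, toml_values or {}, env_values or {}, DEFAULTS]
    return dict(ChainMap(*layers))
```

For example, a `chunk_size` passed in Python overrides the TOML file, while a `storage_dir` set in the TOML file overrides the STELE_CONTEXT_STORAGE_DIR environment variable.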

Supported File Types

Built-in (no extra packages needed): .py, .js, .ts, .jsx, .tsx, .java, .cpp, .c, .h, .go, .rs, .rb, .php, .swift, .sh, .sql, .html, .css, .json, .yaml, .toml, .md, .txt, .rst, .csv, .log

With optional packages: Images (.png, .jpg, .gif, etc.), PDFs, audio (.mp3, .wav, etc.), video (.mp4, .avi, etc.)

Troubleshooting

ImportError: No module named 'stele_context'
Make sure it's installed: pip install stele-context. If using a virtualenv, activate it first.

MCP server not connecting
Use the full path to the binary. Run which stele-context and put that path in your config.

PermissionError when indexing
Another agent might be holding a lock. Run the reap_expired_locks action of the document_lock tool to clean up.

diff_exact: false on files indexed by an older version
Caches built before v1.4.1 reconstruct diffs on a best-effort basis. Re-index the file once (stele-context index <file>) and diffs become hash-exact from then on. No migration needed.

Tools report an old version or stale package metadata
A leftover *.egg-info/ directory in your project root shadows the installed package's metadata for any Python started there. stele-context doctor flags this as a stale_egg_info environment issue; delete the directory or rebuild the package.

FAQ

How do I make Claude Code remember my project between conversations?
Install Stele Context and add it as an MCP server (see Quick Start). Once connected, Claude Code can index your project and recall file contents, symbol locations, and code structure across conversations, without re-reading everything.

How does this reduce AI token costs on a large codebase?
Unchanged files are served from the local cache at zero read cost, and changed files come back as a unified diff instead of full content. The agent's context window holds what's new, not what it already saw.

Does this work with Cursor, Windsurf, or other AI coding tools?
Yes. Stele Context speaks MCP, the standard protocol for agent tools, so any MCP-compatible client can use it. There's also an HTTP REST API and a plain Python library for direct integration.

Does it replace grep, file reads, or my editor's search?
No, it complements them. Native tools stay best for one-off lookups; Stele adds the memory layer: what was already read, what changed since, and how symbols connect across files. Its agent_grep also auto-caches every file it searches, so searching and caching happen in one step.

Does it need an internet connection or API keys?
No. Everything runs locally and is local-first by design. No API calls, no cloud, no model downloads, no telemetry.

Is my code safe?
Your code never leaves your machine. Zero third-party dependencies means no dependency tree to trust: about 17,000 lines of stdlib-only Python you can read and audit yourself, with JSON-only serialization (no pickle).

Can multiple AI agents use it at the same time?
Yes. Built-in document locking, optimistic version tracking, and a conflict log let parallel agents share one index safely, including across git worktrees.

Where is the data stored?
In a .stele-context/ folder in your project root. It's a SQLite database plus index files. Each git worktree gets its own, and history tables auto-prune so it doesn't grow unbounded.

How is this different from CLAUDE.md or project memory files?
CLAUDE.md gives your agent instructions. Stele Context gives it a searchable index of your entire codebase (every function, every import, every file relationship) plus change tracking, so it can answer "where is this used?" or "what changed since I last read this?" without re-reading anything.

Development

pip install -e ".[dev]"
pytest                              # 930+ tests
pytest --cov=stele_context           # With coverage
mypy stele_context/                 # Type checking
ruff check stele_context/           # Linting

Releases

Releases are managed using the stele-context release command. See docs/release-automation.md for the release policy and Grok Build automation details.

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT License — see LICENSE for details.

Download files

Source distribution: stele_context-1.4.4.tar.gz (203.1 kB)

  • SHA256: 838009b5e963d13a67e7a05b5023c26c4532e06a25031a7190263e4c55eb676b
  • MD5: 99bf7779ae0244cf57dcbbfba7dbbb85
  • BLAKE2b-256: bc548e375d00f538cfaa8e51f7b9e3ef32353970548b03a3f680f570051ac054

Built distribution: stele_context-1.4.4-py3-none-any.whl (196.2 kB, Python 3)

  • SHA256: dad6da209ba15213c82f22e398c753d923098d37959f8cc18ec9a2e246ec17c5
  • MD5: 4da2c9ee5fcbbb41d98ef6ae4c796c32
  • BLAKE2b-256: 1238e517b8fd2414e238e038ae9bcc5bdbd5d57b89026ec147b09d74b7f7f157

Both files were uploaded via twine/6.1.0 (CPython/3.13.12) using Trusted Publishing. Publisher: publish.yml on IronAdamant/stele-context.
