
Cortex Claude

Local-first, token-efficient memory system for Claude Code via MCP.

What is this?

Cortex Claude gives Claude Code persistent memory through a local MCP server. Unlike other memory solutions that dump everything into context, Cortex uses progressive recall — a 3-layer retrieval system that returns only what's relevant, using the minimum tokens needed.

Save once:

"The auth service uses JWT tokens with 24-hour expiry. Refresh tokens are stored in httpOnly cookies."

Ask later, get back only what matters:

# Layer 1: Facts (cheapest — ~7 tokens each)
auth service → use → jwt tokens
auth service → use → hour expiry

# Layer 2: Summary (~25% of original)
# Layer 3: Full content (only if needed)

Key Features

  • Progressive recall — 3 layers (facts → summaries → full content), stops at the cheapest sufficient layer
  • Knowledge graph — auto-extracts structured facts via spaCy NLP with multi-hop traversal
  • Token efficient — 66%+ fewer tokens vs. full content retrieval (benchmarked)
  • Local-first — SQLite + local embeddings + local NLP. Zero API calls, zero network, zero cost
  • Graph traversal — navigate entity connections across multiple hops (A → B → C)
  • Entity normalization — "postgres", "PostgreSQL", "pg" all resolve to the same entity
  • Configurable scopes — global, per-project, or custom memory boundaries
  • Deduplication — detects and merges near-identical memories automatically
  • Decay system — unused memories lose relevance over time, keeping results fresh
  • Multi-language — fact extraction and summarization in EN, PT (auto-detected). ES, DE, FR supported with spaCy models
  • Full-text search — FTS5 keyword search alongside semantic vector search
  • Fully configurable — all thresholds, ratios, and behaviors customizable via config.json
  • On-demand — Claude calls memory tools only when needed, nothing auto-injected

Benchmarks

With 10 stored memories (244 total tokens):

| Depth | Tokens returned | Reduction | Latency |
|-------|-----------------|-----------|---------|
| facts | 82              | 66%       | ~10ms   |
| auto  | 82              | 66%       | ~10ms   |
| full  | 244             | 0%        | ~12ms   |

Facts query: 0.1ms. Graph traversal: 0.2ms. Save: ~30ms (after model load).

Quick Start

Install

pip install cortex-claude

Configure Claude Code

Add a .mcp.json to your project root (or ~/.claude.json for global):

{
  "mcpServers": {
    "cortex": {
      "type": "stdio",
      "command": "python",
      "args": ["-m", "cortex_claude"]
    }
  }
}

First run downloads the embedding model (~80MB) and spaCy model (~12MB) automatically.
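
If you'd rather pre-fetch both models (for CI, or to avoid the first-run delay), the standard sentence-transformers and spaCy download calls work; the model names below are the package defaults:

from sentence_transformers import SentenceTransformer
import spacy.cli

SentenceTransformer("all-MiniLM-L6-v2")  # fetches the embedding model (~80MB)
spacy.cli.download("en_core_web_sm")     # fetches the English spaCy model (~12MB)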

Use

In any Claude Code session:

"Remember that the API uses rate limiting at 500 req/min"
→ cortex_save stores it, extracts facts, generates embedding

"What do you know about rate limiting?"
→ cortex_recall finds it via progressive recall

"What facts do you have about the API?"
→ cortex_facts returns structured knowledge graph triplets

"What's connected to the auth service?"
→ cortex_traverse follows graph connections across hops

"Forget what I said about the old API key"
→ cortex_forget removes matching memories (with preview first)

"Show me the memory status"
→ cortex_status shows totals, scopes, storage size

Tools

| Tool | What it does | Token cost |
|------|--------------|------------|
| cortex_save | Store memory with auto fact extraction, summarization, and embedding | N/A |
| cortex_recall | Progressive retrieval: facts → summaries → full content | Controlled via max_tokens budget |
| cortex_facts | Direct knowledge graph query, returns structured triplets | ~5-15 tokens per fact |
| cortex_traverse | Navigate the knowledge graph across multiple hops | ~5-15 tokens per connection |
| cortex_forget | Delete memories by query or ID. Dry-run by default (preview before deleting) | N/A |
| cortex_scopes | Manage scopes: list, create, delete, link/unlink directories | N/A |
| cortex_status | Dashboard: memory count, fact count, storage size per scope | N/A |

cortex_recall depth modes

| Mode | Returns | When to use |
|------|---------|-------------|
| auto | Starts cheap, escalates if needed | Default; best for most queries |
| facts | Only knowledge graph triplets | Quick lookups, minimal token use |
| summaries | Facts + compressed summaries | Medium detail needed |
| full | All layers including original text | Full context needed |

How It Works

Save: content → embedding + fact extraction (spaCy) + summarization → SQLite

Recall (progressive):
  1. Facts layer     (~5-15 tokens/fact)   → sufficient? stop
  2. Summaries layer (~25% of original)    → sufficient? stop
  3. Full chunks     (original content)    → return
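
A minimal sketch of that loop. The layer functions are toy stand-ins for the real SQLite/vector queries, and the sufficiency heuristic is simplified; the 0.7 default mirrors coverage_threshold in config.json below:

from dataclasses import dataclass

@dataclass
class LayerResult:
    text: str        # what would be returned to the model
    tokens: int      # token cost of this layer's output
    coverage: float  # fraction of the query judged answered (0..1)

# Toy stand-ins; the real server queries SQLite, FTS5, and vectors here.
def facts_layer(query):     return LayerResult("auth service → use → jwt tokens", 7, 0.8)
def summaries_layer(query): return LayerResult("Auth uses JWTs; refresh in cookies.", 20, 0.9)
def full_layer(query):      return LayerResult("<original memory text>", 60, 1.0)

def progressive_recall(query, max_tokens=200, coverage_threshold=0.7):
    """Return the cheapest layer whose output sufficiently covers the query."""
    parts, spent = [], 0
    for layer in (facts_layer, summaries_layer, full_layer):
        result = layer(query)
        if spent + result.tokens > max_tokens:
            break                                   # respect the token budget
        parts.append(result.text)
        spent += result.tokens
        if result.coverage >= coverage_threshold:
            break                                   # sufficient: stop here
    return "\n".join(parts)

print(progressive_recall("How does auth work?"))    # stops at the facts layer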

Fact extraction uses spaCy dependency parsing and NER to produce subject-relation-object triplets. Runs locally, costs zero tokens. Entities are normalized and deduplicated ("postgres" → "postgresql").
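
A deliberately cruder version of that pipeline, using real spaCy APIs (requires en_core_web_sm); the alias table is a toy stand-in for the package's normalizer:

import spacy

nlp = spacy.load("en_core_web_sm")

# Toy alias table for entity normalization; illustrative only.
ALIASES = {"postgres": "postgresql", "pg": "postgresql"}

def normalize(tok):
    """Lowercase the token's subtree and map known aliases to one entity."""
    text = " ".join(t.text.lower() for t in tok.subtree)
    return ALIASES.get(text, text)

def extract_triplets(text):
    """Extract rough (subject, relation, object) triplets around verbs."""
    triplets = []
    for tok in nlp(text):
        if tok.pos_ != "VERB":
            continue
        subjects = [c for c in tok.children if c.dep_ in ("nsubj", "nsubjpass")]
        objects  = [c for c in tok.children if c.dep_ in ("dobj", "attr")]
        # follow prepositional objects: verb → prep → pobj
        for prep in (c for c in tok.children if c.dep_ == "prep"):
            objects += [c for c in prep.children if c.dep_ == "pobj"]
        triplets += [(normalize(s), tok.lemma_, normalize(o))
                     for s in subjects for o in objects]
    return triplets

print(extract_triplets("The auth service uses JWT tokens."))
# → roughly [('the auth service', 'use', 'jwt tokens')]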

Graph traversal navigates entity connections across multiple hops. Query "auth" and discover: auth → JWT → express-jwt → middleware.
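
Under the hood this is plain breadth-first search over stored triplets; a self-contained sketch with illustrative data:

from collections import deque

def traverse(triplets, start, max_hops=3):
    """BFS over (subject, relation, object) triplets, up to max_hops out."""
    graph = {}
    for s, r, o in triplets:
        graph.setdefault(s, []).append((r, o))
    seen, queue, edges = {start}, deque([(start, 0)]), []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for rel, nxt in graph.get(node, []):
            edges.append((node, rel, nxt))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return edges

triplets = [("auth", "use", "jwt"), ("jwt", "verified_by", "express-jwt"),
            ("express-jwt", "is", "middleware")]
print(traverse(triplets, "auth"))
# [('auth', 'use', 'jwt'), ('jwt', 'verified_by', 'express-jwt'),
#  ('express-jwt', 'is', 'middleware')]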

Summarization is extractive: sentences are scored by TF-IDF, entity density, and position, and the top-ranked ones are kept. No LLM calls. Multi-language aware (EN/PT).
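
A rough illustration of the sentence-scoring idea, with raw term frequency standing in for TF-IDF and entity density omitted:

import re
from collections import Counter

def summarize(text, ratio=0.25):
    """Keep the highest-scoring ~ratio of sentences, in original order."""
    sents = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))
    def score(i, sent):
        toks = re.findall(r"\w+", sent.lower())
        tf = sum(freq[t] for t in toks) / max(len(toks), 1)
        return tf + 1.0 / (i + 1)   # position bonus: earlier sentences win ties
    ranked = sorted(range(len(sents)), key=lambda i: -score(i, sents[i]))
    keep = sorted(ranked[:max(1, round(len(sents) * ratio))])
    return " ".join(sents[i] for i in keep)

print(summarize("Cortex stores memories locally. It extracts facts with spaCy. "
                "Summaries are generated extractively. No API calls are made."))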

Deduplication detects near-identical memories (cosine similarity threshold, configurable) and merges them.
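
A sketch of the similarity check using sentence-transformers directly; the 0.92 threshold is the config default below, and merge logic is omitted:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

def is_near_duplicate(a, b, threshold=0.92):
    """Cosine similarity of normalized embeddings vs. the configured threshold."""
    ea, eb = model.encode([a, b], normalize_embeddings=True)
    return float(ea @ eb) >= threshold

print(is_near_duplicate("API rate limit is 500 req/min",
                        "The API is rate-limited at 500 requests per minute"))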

Decay — memories that aren't accessed lose relevance over time (score = e^(-λ * days) * (1 + log(access_count))). Recalculated on server startup. Affects ranking in all recall layers.
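
Worked through in Python for concreteness, assuming log is the natural logarithm: a memory last accessed 30 days ago with 5 recalls scores about 0.58.

import math

def decay_score(days_since_access, access_count, lam=0.05):
    # score = e^(-λ * days) * (1 + log(access_count)), natural log assumed
    return math.exp(-lam * days_since_access) * (1 + math.log(access_count))

print(decay_score(30, 5))  # ≈ 0.223 * 2.609 ≈ 0.582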

Hybrid search — combines vector similarity (semantic) + FTS5 (keyword exact match) for best recall. FTS5 synced automatically via SQLite triggers.
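
The FTS5 half can be reproduced with Python's standard sqlite3 module, assuming your SQLite build includes FTS5 (most do); table and trigger names are illustrative, and only the insert trigger is shown:

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE memories(id INTEGER PRIMARY KEY, content TEXT);
CREATE VIRTUAL TABLE memories_fts USING fts5(
    content, content='memories', content_rowid='id');
-- keep the FTS index in sync on insert (update/delete triggers omitted)
CREATE TRIGGER memories_ai AFTER INSERT ON memories BEGIN
    INSERT INTO memories_fts(rowid, content) VALUES (new.id, new.content);
END;
""")
con.execute("INSERT INTO memories(content) VALUES ('rate limiting at 500 req/min')")
print(con.execute(
    "SELECT content FROM memories_fts WHERE memories_fts MATCH 'rate'").fetchall())
# [('rate limiting at 500 req/min',)]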

Scopes isolate memories per project. Manage via cortex_scopes tool or configure in ~/.cortex-claude/config.json.

Configuration

All behavior is customizable via ~/.cortex-claude/config.json:

{
  "recall": {
    "default_max_tokens": 200,
    "default_depth": "auto",
    "sufficiency": {
      "coverage_threshold": 0.7,
      "confidence_threshold": 0.6
    }
  },
  "embeddings": {
    "model": "all-MiniLM-L6-v2",
    "batch_size": 32
  },
  "facts": {
    "extraction_method": "local",
    "min_confidence": 0.5
  },
  "decay": {
    "lambda": 0.05,
    "recalculate_interval_hours": 6,
    "min_score": 0.01
  },
  "deduplication": {
    "similarity_threshold": 0.92,
    "merge_strategy": "append"
  },
  "scopes": {
    "mappings": {
      "/path/to/project-a": "project:a",
      "/path/to/project-b": "project:b"
    },
    "default": "global",
    "search_order": "project_first"
  },
  "storage": {
    "max_db_size_mb": 500
  }
}

All fields are optional. Defaults are used for anything not specified.

Development

git clone https://github.com/rafaelaugustos/cortex-claude.git
cd cortex-claude
uv venv --python python3.13
uv sync --all-extras
uv run python -m spacy download en_core_web_sm
uv run pytest

Run the demo:

uv run python scripts/demo.py

Run benchmarks:

uv run python scripts/benchmark.py

Architecture

See ARCHITECTURE.md for the full technical specification.

License

MIT

