memctl
A Unix-native memory control plane for LLM orchestration.
One file, one truth. Ingest files, recall with FTS5, pipe into any LLM.
pip install memctl
memctl init
memctl push "project architecture" --source src/ | llm "Summarize the architecture"
echo "The architecture uses event sourcing" | memctl pull --tags arch
Why memctl?
LLMs forget everything between turns. memctl gives them persistent, structured, policy-governed memory backed by a single SQLite file.
- Zero dependencies: stdlib only. No numpy, no torch, no compiled extensions.
- One file: everything lives in memory.db (SQLite + FTS5 + WAL).
- Unix composable: push writes to stdout, pull reads from stdin. Pipe freely.
- Policy-governed: 30 detection patterns block secrets, injection, and instructional content before storage.
- Content-addressed: SHA-256 dedup ensures idempotent ingestion.
- Forward-compatible: identical schema to RAGIX. Upgrade seamlessly.
Installation
pip install memctl
For Office/ODF document ingestion (.docx, .odt, .pptx, .odp, .xlsx, .ods):
pip install memctl[docs]
For MCP server support (Claude Code / Claude Desktop):
pip install memctl[mcp]
For everything:
pip install memctl[all]
Requirements: Python 3.10+ (3.12 recommended). No compiled dependencies for core.
PDF extraction requires pdftotext from poppler-utils (sudo apt install poppler-utils or brew install poppler).
Quickstart
1. Initialize a memory workspace
memctl init
# Creates .memory/memory.db, .memory/config.yaml, .memory/.gitignore
Set the environment variable for convenience:
eval "$(memctl init)"
# Sets MEMCTL_DB=.memory/memory.db
2. Ingest files and recall
# Ingest source files + recall matching items → injection block on stdout
memctl push "authentication flow" --source src/auth/
# Ingest Office documents (requires memctl[docs])
memctl push "project status" --source reports/*.docx slides/*.pptx
# Ingest PDFs (requires pdftotext)
memctl push "specifications" --source specs/*.pdf
# Recall only (no ingestion)
memctl push "database schema"
3. Store LLM output
# Pipe LLM output into memory
echo "We chose JWT for stateless auth" | memctl pull --tags auth,decision --title "Auth decision"
# Or pipe from any LLM CLI
memctl push "API design" | llm "Analyze this" | memctl pull --tags api
4. Search
# Human-readable
memctl search "authentication"
# JSON for scripts
memctl search "database" --json -k 5
5. Inspect a folder (one-liner)
# Auto-mounts, auto-syncs, and inspects — all in one command
memctl inspect docs/
# Same in JSON (for scripts)
memctl inspect docs/ --json
# Skip sync (use cached state)
memctl inspect docs/ --no-sync
inspect auto-mounts the folder if needed, checks staleness, syncs only if stale, and produces a structural summary. All implicit actions are announced on stderr.
6. Manage
memctl show MEM-abc123def456 # Show item details
memctl stats # Store metrics
memctl stats --json # Machine-readable stats
memctl consolidate # Merge similar STM items
memctl consolidate --dry-run # Preview without writing
CLI Reference
memctl <command> [options]
Commands
| Command | Description |
|---|---|
| init [PATH] | Initialize a memory workspace (default: .memory) |
| push QUERY [--source ...] | Ingest files + recall matching items to stdout |
| pull [--tags T] [--title T] | Read stdin, store as memory items |
| search QUERY [-k N] | FTS5 full-text search |
| show ID | Display a single memory item |
| stats | Store statistics |
| consolidate [--dry-run] | Deterministic merge of similar STM items |
| loop QUERY --llm CMD | Bounded recall-answer loop with LLM |
| mount PATH | Register a folder as a structured source |
| sync [PATH] | Delta-sync mounted folders into the store |
| inspect [PATH] | Structural inspection with auto-mount and auto-sync |
| serve | Start MCP server (requires memctl[mcp]) |
Global Flags
| Flag | Description |
|---|---|
| --db PATH | SQLite database path |
| --json | Machine-readable JSON output |
| -q, --quiet | Suppress stderr progress messages |
| -v, --verbose | Enable debug logging |
Command Details
memctl init
memctl init [PATH] [--force] [--fts-tokenizer fr|en|raw]
Creates the workspace directory, SQLite database with schema, config.yaml, and .gitignore. Prints export MEMCTL_DB="..." to stdout for eval.
Idempotent: running twice on the same path exits 0 without error.
memctl push
memctl push QUERY [--source FILE ...] [--budget N] [--tier TIER] [--tags T] [--scope S]
Two-phase command:
- Ingest (optional): processes --source files with SHA-256 dedup and paragraph chunking.
- Recall: FTS5 search for QUERY, format matching items as an injection block on stdout.
stdout contains only the injection block (format_version=1). Progress goes to stderr.
memctl pull
echo "..." | memctl pull [--tags T] [--title T] [--scope S]
Reads text from stdin and stores it as memory items. Attempts structured proposal extraction first; falls back to single-note storage. All content passes through the policy engine before storage.
memctl search
memctl search QUERY [--tier TIER] [--type TYPE] [-k N] [--json]
FTS5 full-text search. Returns human-readable output by default, or JSON with --json.
memctl consolidate
memctl consolidate [--scope S] [--dry-run] [--json]
Deterministic consolidation: clusters STM items by type + tag overlap (Jaccard), merges each cluster (longest content wins), promotes to MTM. High-usage MTM items promote to LTM. No LLM calls.
memctl loop
memctl push "question" | memctl loop "question" --llm "claude -p" [--max-calls 3] [--protocol json]
Bounded recall-answer loop: sends context + question to an external LLM, parses its response for refinement directives, performs additional recalls from the memory store, and detects convergence. The LLM is never autonomous — it only proposes queries. The controller enforces bounds, dedup, and stopping conditions.
Protocol: The LLM must output a JSON first line: {"need_more": bool, "query": "...", "stop": bool}, followed by its answer. Supported protocols: json (default), regex, passive (single-pass, no refinement).
Stopping conditions:
- llm_stop: the LLM sets stop to true
- fixed_point: consecutive answers are similar above threshold (default 0.92)
- query_cycle: the LLM re-requests a query already tried
- no_new_items: recall returns no new items for the proposed query
- max_calls: iteration limit reached (default 3)
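The protocol and stopping conditions above can be sketched in stdlib Python. This is an illustration under stated assumptions, not memctl's actual controller: the names parse_directive and stop_reason are hypothetical, difflib.SequenceMatcher is assumed as the similarity measure, the order in which conditions are checked is a guess, and the --stable-steps requirement is left out for brevity. The thresholds are the documented defaults.

```python
import json
from difflib import SequenceMatcher


def parse_directive(response):
    """Split an LLM response into (directive dict from line 1, answer text)."""
    first, _, rest = response.partition("\n")
    try:
        directive = json.loads(first)
    except json.JSONDecodeError:
        directive = None
    if not isinstance(directive, dict):
        # Assumed fallback: no valid JSON first line means a final, plain answer.
        return {"need_more": False, "query": "", "stop": True}, response.strip()
    return directive, rest.strip()


def similar(a, b):
    """Similarity ratio in [0, 1] between two strings."""
    return SequenceMatcher(None, a, b).ratio()


def stop_reason(directive, answer, prev_answer, tried_queries, new_items, calls,
                max_calls=3, threshold=0.92, query_threshold=0.90):
    """Return the name of the first stopping condition that fires, else None."""
    if directive.get("stop"):
        return "llm_stop"
    if prev_answer is not None and similar(answer, prev_answer) >= threshold:
        return "fixed_point"
    query = directive.get("query", "")
    if any(similar(query, q) >= query_threshold for q in tried_queries):
        return "query_cycle"
    if new_items == 0:
        return "no_new_items"
    if calls >= max_calls:
        return "max_calls"
    return None
```

A controller built this way would call stop_reason after every LLM invocation and exit the loop as soon as it returns a name; the LLM only ever supplies the directive, never the decision to continue.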
Flags:
| Flag | Default | Description |
|---|---|---|
| --llm CMD | (required) | LLM command (e.g. "claude -p", "ollama run granite3.1:2b") |
| --llm-mode | stdin | How to pass the prompt: stdin or file |
| --protocol | json | LLM output protocol: json, regex, passive |
| --system-prompt | (auto) | Custom system prompt (text or file path) |
| --max-calls | 3 | Maximum LLM invocations |
| --threshold | 0.92 | Answer fixed-point similarity threshold |
| --query-threshold | 0.90 | Query cycle similarity threshold |
| --stable-steps | 2 | Consecutive stable steps for convergence |
| --no-stop-on-no-new | off | Continue even if recall returns no new items |
| --budget | 2200 | Token budget for context |
| --trace | off | Emit JSONL trace to stderr |
| --trace-file | (none) | Write JSONL trace to file |
| --strict | off | Exit 1 if max-calls reached without convergence |
| --timeout | 300 | LLM subprocess timeout (seconds) |
| --replay FILE | (none) | Replay a trace file (no LLM calls) |
Example pipeline:
# Iterative recall with Claude
memctl push "How does authentication work?" --source docs/ \
| memctl loop "How does authentication work?" --llm "claude -p" --trace
# Sovereign local LLM
memctl push "database schema" --source src/ \
| memctl loop "database schema" --llm "ollama run granite3.1:2b" --protocol json
# Replay a trace (no LLM needed)
memctl loop --replay trace.jsonl "original question"
memctl mount
memctl mount PATH [--name NAME] [--ignore PATTERN ...] [--lang HINT]
memctl mount --list
memctl mount --remove ID_OR_NAME
Registers a folder as a structured source. Stores metadata only — no scanning, no ingestion. The folder contents are synced separately via sync or automatically via inspect.
memctl sync
memctl sync [PATH] [--full] [--json] [--quiet]
Delta-syncs mounted folders into the memory store. Uses a 3-tier delta rule:
- New file (not in DB) → ingest
- Size + mtime match → fast skip (no hashing)
- Hash compare → ingest only if content changed
If PATH is given but not yet mounted, it is auto-registered first. --full forces re-processing of all files.
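The 3-tier delta rule can be written as a single decision function. The sketch below is an illustration only: the Known record and the delta_action name are hypothetical and do not come from memctl's sync module.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Known:
    """What the store remembers about a file (hypothetical record shape)."""
    size: int
    mtime: int
    sha256: str


def delta_action(known: Optional[Known], size: int, mtime: int, content: bytes) -> str:
    """Apply the 3-tier rule and return 'ingest' or 'skip'."""
    if known is None:
        return "ingest"                               # tier 1: file not in DB
    if (known.size, known.mtime) == (size, mtime):
        return "skip"                                 # tier 2: fast skip, no hashing
    digest = hashlib.sha256(content).hexdigest()      # tier 3: hash compare
    return "skip" if digest == known.sha256 else "ingest"
```

The ordering matters for cost: the hash is computed only when the cheap size and mtime comparison fails, so an unchanged tree is never hashed, and a file that was merely touched is hashed but not re-ingested.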
memctl inspect
# Orchestration mode — auto-mounts, auto-syncs, and inspects
memctl inspect PATH [--sync auto|always|never] [--no-sync] [--mount-mode persist|ephemeral]
[--budget N] [--ignore PATTERN ...] [--json] [--quiet]
# Classic mode — inspect an existing mount by ID/name
memctl inspect --mount ID_OR_NAME [--budget N] [--json] [--quiet]
When given a positional PATH, inspect operates in orchestration mode:
- Auto-mount: registers the folder if not already mounted
- Staleness check: compares disk inventory (path/size/mtime triples) against the store
- Auto-sync: runs delta sync only if stale (or always/never per --sync)
- Inspect: generates a deterministic structural summary
Output includes file/chunk/size totals, per-folder breakdown, per-extension distribution, top-5 largest files, and rule-based observations. All paths in output are mount-relative (never absolute).
--mount-mode ephemeral removes the mount record after inspection (corpus data is preserved). --no-sync is shorthand for --sync never.
All implicit actions (mount, sync) are announced on stderr. --quiet suppresses them.
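The staleness check compares (path, size, mtime) triples, which a few lines of stdlib Python can illustrate. The function names disk_inventory and is_stale are hypothetical, and rounding mtime to whole seconds is an assumption rather than documented behavior.

```python
import os


def disk_inventory(root):
    """Collect (mount-relative path, size, mtime) triples for every file under root."""
    triples = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            triples.add((os.path.relpath(full, root), st.st_size, int(st.st_mtime)))
    return triples


def is_stale(on_disk, in_store):
    """Stale when any file was added, removed, resized, or touched."""
    return on_disk != in_store
```

Because the comparison uses set equality, a single added, removed, or modified file is enough to trigger a sync, while an untouched folder is detected without reading any file content.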
Environment Variables
| Variable | Default | Description |
|---|---|---|
| MEMCTL_DB | .memory/memory.db | Path to SQLite database |
| MEMCTL_BUDGET | 2200 | Token budget for injection blocks |
| MEMCTL_FTS | fr | FTS tokenizer preset (fr/en/raw) |
| MEMCTL_TIER | stm | Default write tier |
| MEMCTL_SESSION | (unset) | Session ID for audit provenance |
Precedence: CLI --flag > MEMCTL_* env var > compiled default. Always.
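The precedence rule can be sketched as a small resolver. This is a minimal illustration, not memctl's configuration code: the resolve function and the DEFAULTS mapping are hypothetical, with default values copied from the table above, and treating an empty environment variable as "set" is an assumption.

```python
import os

# Defaults taken from the environment variable table; the dict itself is illustrative.
DEFAULTS = {"db": ".memory/memory.db", "budget": "2200", "fts": "fr", "tier": "stm"}


def resolve(name, cli_value=None, env=None):
    """Resolve a setting: CLI flag first, then MEMCTL_* variable, then default."""
    env = os.environ if env is None else env
    if cli_value is not None:
        return cli_value
    env_value = env.get("MEMCTL_" + name.upper())
    if env_value is not None:
        return env_value
    return DEFAULTS[name]
```

For example, resolve("budget", cli_value="800", env={"MEMCTL_BUDGET": "4000"}) yields "800", because the flag outranks the variable.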
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success (including idempotent no-op) |
| 1 | Operational error (bad args, empty input, policy rejection) |
| 2 | Internal failure (unexpected exception, I/O error) |
Shell Integration
Add to .bashrc, .zshrc, or your project's env.sh:
export MEMCTL_DB=.memory/memory.db
# Shortcuts
meminit() { memctl init "${1:-.memory}"; }
memq() { memctl push "$1"; } # recall only
memp() { memctl push "$1" ${2:+--source "$2"}; } # push with optional source
mempull() { memctl pull --tags "${1:-}" ${2:+--title "$2"}; }
Pipe Recipes
# Ingest docs + recall + feed to LLM + store output
memctl push "API design" --source docs/ | llm "Summarize" | memctl pull --tags api
# Search and pipe to jq
memctl search "auth" --json | jq '.[].title'
# Batch ingest a directory
memctl push "project overview" --source src/ tests/ docs/ -q
# Export all items as JSONL
memctl search "" --json | jq -c '.[]'
# Iterative recall-answer loop with trace
memctl push "auth flow" --source docs/ | memctl loop "auth flow" --llm "claude -p" --trace
# One-liner: inspect a folder (auto-mount + auto-sync)
memctl inspect docs/
# Inspect in JSON, pipe to jq for extension breakdown
memctl inspect src/ --json | jq '.extensions'
# Inspect without syncing (use cached state)
memctl inspect docs/ --no-sync --json
MCP Server
memctl exposes 7 MCP tools for integration with Claude Code, Claude Desktop, VS Code, and any MCP-compatible client.
Start the Server
memctl serve --db .memory/memory.db
# or
python -m memctl.mcp.server --db .memory/memory.db
Claude Code Integration
Add to .claude/settings.json:
{
  "mcpServers": {
    "memctl": {
      "command": "memctl",
      "args": ["serve", "--db", ".memory/memory.db"]
    }
  }
}
MCP Tools
| Tool | Description |
|---|---|
| memory_recall | Token-budgeted context injection (primary tool) |
| memory_search | Interactive FTS5 discovery |
| memory_propose | Store findings with policy governance |
| memory_write | Direct write (privileged/dev operations) |
| memory_read | Read items by ID |
| memory_stats | Store metrics |
| memory_consolidate | Trigger deterministic merge |
Tool names use the memory_* prefix for drop-in compatibility with RAGIX.
How It Works
Architecture
memctl/
├── types.py Data model (MemoryItem, MemoryProposal, MemoryEvent, MemoryLink)
├── store.py SQLite + FTS5 + WAL backend (10 tables + schema_meta)
├── extract.py Text extraction (text files + binary format dispatch)
├── ingest.py Paragraph chunking, SHA-256 dedup, source resolution
├── policy.py Write governance (30 patterns: secrets, injection, instructional)
├── config.py Dataclass configuration
├── similarity.py Stdlib text similarity (Jaccard + SequenceMatcher)
├── loop.py Bounded recall-answer loop controller
├── mount.py Folder mount registration and management
├── sync.py Delta sync with 3-tier change detection
├── inspect.py Structural inspection and orchestration
├── cli.py 12 CLI commands
├── consolidate.py Deterministic merge (Jaccard clustering, no LLM)
├── proposer.py LLM output parsing (delimiter + regex)
└── mcp/
    ├── tools.py 7 MCP tools (memory_* prefix)
    ├── formatting.py Injection block format (format_version=1)
    └── server.py FastMCP server entry point
19 source files. ~7,300 lines. Zero compiled dependencies for core.
Memory Tiers
| Tier | Purpose | Lifecycle |
|---|---|---|
| STM (Short-Term) | Recent observations, unverified facts | Created by pull. Consolidated or expired. |
| MTM (Medium-Term) | Verified, consolidated knowledge | Created by consolidate. Promoted by usage. |
| LTM (Long-Term) | Stable decisions, definitions, constraints | Promoted from MTM by usage count or type. |
Policy Engine
Every write path passes through the policy engine. No exceptions.
Hard blocks (rejected):
- 10 secret detection patterns (API keys, tokens, passwords, private keys, JWTs)
- 8 injection patterns (prompt override, system prompt fragments)
- 8 instructional block patterns (tool invocation syntax, role fragments)
- Oversized content (>2000 chars for non-pointer types)
Soft blocks (quarantined to STM with expiry):
- 4 instructional quarantine patterns (imperative self-instructions)
- Missing provenance or justification
- Quarantined items are stored with injectable=False
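The split between hard and soft blocks can be illustrated with a toy evaluator. Everything here is a sketch under stated assumptions: the regular expressions are hypothetical stand-ins (memctl's real 30 patterns are not reproduced in this document), and the evaluate function is not the MemoryPolicy API. Only the 2000-character limit and the three verdict names come from the text.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only.
HARD_BLOCKS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                           # AWS-style access key ID
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),         # PEM private key header
    re.compile(r"ignore (all )?previous instructions", re.I),  # prompt override
]
SOFT_BLOCKS = [
    re.compile(r"^\s*(always|never|you must)\b", re.I | re.M),  # imperative self-instruction
]
MAX_CHARS = 2000  # documented size limit for non-pointer types


def evaluate(content: str, provenance: Optional[str] = None) -> str:
    """Return 'reject', 'quarantine', or 'accept' for a candidate memory item."""
    if len(content) > MAX_CHARS or any(p.search(content) for p in HARD_BLOCKS):
        return "reject"
    if not provenance or any(p.search(content) for p in SOFT_BLOCKS):
        return "quarantine"
    return "accept"
```

Hard blocks are checked first so that a secret is rejected even when it arrives with valid provenance; only content that clears them can be downgraded to quarantine.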
FTS5 Tokenizer Presets
| Preset | Tokenizer | Use Case |
|---|---|---|
| fr | unicode61 remove_diacritics 2 | French-safe default (accent normalization) |
| en | porter unicode61 remove_diacritics 2 | English with Porter stemming |
| raw | unicode61 | No diacritics removal, no stemming |
Expert override: memctl init --fts-tokenizer "porter unicode61 remove_diacritics 2"
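To see what the fr and en tokenizer strings do, the sketch below builds a throwaway FTS5 table with Python's sqlite3 module. It assumes the local SQLite build includes FTS5 and is version 3.27 or later (needed for remove_diacritics 2). The table name notes and the helpers make_index and search are hypothetical and unrelated to memctl's own schema.

```python
import sqlite3


def make_index(tokenizer, rows):
    """Create an in-memory FTS5 table using the given tokenizer string."""
    conn = sqlite3.connect(":memory:")
    # The tokenizer is interpolated into DDL, so it must come from trusted config.
    conn.execute(
        f"CREATE VIRTUAL TABLE notes USING fts5(title, content, tokenize = '{tokenizer}')"
    )
    conn.executemany("INSERT INTO notes (title, content) VALUES (?, ?)", rows)
    return conn


def search(conn, query, k=5):
    """Return up to k matching titles, best match first."""
    rows = conn.execute(
        "SELECT title FROM notes WHERE notes MATCH ? ORDER BY rank LIMIT ?",
        (query, k),
    ).fetchall()
    return [r[0] for r in rows]
```

With the fr tokenizer string, an unaccented query such as decision matches text containing Décision; with the en string, a query for token also matches Tokens because of Porter stemming.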
Supported Formats
| Category | Extensions | Requirement |
|---|---|---|
| Text / Markup | .md .txt .rst .csv .tsv .html .xml .json .yaml .toml | None (stdlib) |
| Source Code | .py .js .ts .jsx .tsx .java .go .rs .c .cpp .sh .sql .css … | None (stdlib) |
| Office Documents | .docx .odt | pip install memctl[docs] |
| Presentations | .pptx .odp | pip install memctl[docs] |
| Spreadsheets | .xlsx .ods | pip install memctl[docs] |
| PDF | .pdf | pdftotext (poppler-utils) |
All formats are extracted to plain text before chunking and ingestion. Binary format libraries are lazy-imported — a missing library produces a clear ImportError with install instructions.
Content Addressing
Every ingested file is hashed (SHA-256). Re-ingesting the same file is a no-op. Every memory item stores a content_hash for deduplication.
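Content addressing is what makes re-ingestion idempotent, and the idea fits in a few lines. The sketch below is an illustration, not memctl's ingest module: chunk_paragraphs and ingest are hypothetical names, splitting on blank lines is an assumed chunking rule, and an in-memory set stands in for the stored hashes.

```python
import hashlib


def chunk_paragraphs(text):
    """Split on blank lines (an assumed chunking rule) and drop empty chunks."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def ingest(text, seen):
    """Return the chunks not seen before, recording their SHA-256 in `seen`."""
    fresh = []
    for chunk in chunk_paragraphs(text):
        digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            fresh.append(chunk)
    return fresh
```

Calling ingest twice with the same text returns new chunks the first time and an empty list the second time, which is the no-op behavior described above.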
Consolidation
Deterministic, no-LLM merge pipeline:
- Collect non-archived STM items
- Cluster by type + tag overlap (Jaccard similarity)
- Merge each cluster: longest content wins; tie-break by earliest created_at, then lexicographic ID
- Write merged items at MTM tier + supersedes links
- Archive originals (archived=True)
- Promote high-usage MTM items to LTM
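The clustering and merge steps of this pipeline can be sketched as follows. This is an illustration under assumptions, not the consolidate module: the Item record is a hypothetical simplification, the greedy seed-based clustering and the 0.5 overlap threshold are guesses, and only the tie-break order (longest content, earliest created_at, lexicographic ID) is taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Simplified memory item (hypothetical shape)."""
    id: str
    type: str
    tags: frozenset
    content: str
    created_at: str


def jaccard(a, b):
    """Jaccard similarity of two tag sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(items, min_overlap=0.5):
    """Greedy clustering: join the first cluster whose seed has the same type and enough tag overlap."""
    clusters = []
    for item in sorted(items, key=lambda i: i.id):  # sorted input keeps the result deterministic
        for group in clusters:
            seed = group[0]
            if seed.type == item.type and jaccard(seed.tags, item.tags) >= min_overlap:
                group.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def winner(group):
    """Longest content wins; ties go to earliest created_at, then smallest ID."""
    return min(group, key=lambda i: (-len(i.content), i.created_at, i.id))
```

Sorting the input and using a total order in winner is what keeps the result deterministic: the same set of STM items always produces the same clusters and the same surviving content, with no LLM involved.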
Database Schema
Single SQLite file with WAL mode. 10 tables + 1 FTS5 virtual table:
| Table | Purpose |
|---|---|
| memory_items | Core memory items (22 columns) |
| memory_revisions | Immutable revision history |
| memory_events | Audit log (every read/write/consolidate) |
| memory_links | Directional relationships (supersedes, supports, etc.) |
| memory_embeddings | Reserved for RAGIX (empty in memctl) |
| corpus_hashes | SHA-256 file dedup + mount metadata (mount_id, rel_path, ext, size_bytes, mtime_epoch, lang_hint) |
| corpus_metadata | Corpus-level metadata |
| schema_meta | Schema version, creation info |
| memory_palace_locations | Reserved for RAGIX |
| memory_mounts | Registered folder mounts (path, name, ignore patterns, lang hint) |
| memory_items_fts | FTS5 virtual table for full-text search |
Schema version is tracked in schema_meta. Current: SCHEMA_VERSION=2. Migration from v1 is additive (ALTER TABLE ADD COLUMN) and idempotent.
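An additive, idempotent migration step of this kind can be sketched with sqlite3. The helper add_column_if_missing is hypothetical and is not taken from memctl's store module; it only shows one common way to make ALTER TABLE ADD COLUMN safe to run repeatedly.

```python
import sqlite3


def add_column_if_missing(conn, table, column, decl):
    """Add a column only when it is absent; return True if the schema changed."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    # Identifiers cannot be bound as parameters, so they must come from trusted code.
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True
```

Running the same step a second time finds the column already present and does nothing, so a migration built from such steps can be re-applied on every startup.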
Migration to RAGIX
memctl is extracted from RAGIX and maintains schema-identical databases. To upgrade:
git clone git@github.com:ovitrac/RAGIX.git
cd RAGIX
pip install -e .[all]
# Point at the same database — all items carry over
ragix memory stats --db /path/to/your/.memory/memory.db
| Feature | memctl | RAGIX |
|---|---|---|
| SQLite schema | Forward-compatible (RAGIX can open memctl DBs) | Superset |
| Injection format | format_version=1 | format_version=1 |
| MCP tool names | memory_* | memory_* |
| FTS5 recall | Yes | Yes (+ hybrid embeddings) |
| Folder mount + sync | Yes (v0.3+) | No |
| Embeddings | No | Yes (FAISS + Ollama) |
| LLM-assisted merge | No | Yes |
| Graph-RAG | No | Yes |
| Reporting | No | Yes |
Python API
from memctl import MemoryStore, MemoryItem, MemoryPolicy
from memctl.types import MemoryProposal

# Open or create a store
store = MemoryStore(db_path=".memory/memory.db")

# Write an item
item = MemoryItem(
    title="Architecture decision",
    content="We chose event sourcing for state management",
    tier="stm",
    type="decision",
    tags=["architecture", "event-sourcing"],
)
store.write_item(item, reason="manual")

# Search and print the first 80 characters of each hit
results = store.search_fulltext("event sourcing", limit=10)
for r in results:
    print(f"[{r.tier}] {r.title}: {r.content[:80]}")

# Policy check: evaluate a proposal before storing it
policy = MemoryPolicy()
proposal = MemoryProposal(
    title="Config", content="Some content",
    why_store="Important finding",
    provenance_hint={"source_kind": "doc", "source_id": "design.md"},
)
verdict = policy.evaluate_proposal(proposal)
print(verdict.action)  # "accept", "quarantine", or "reject"

store.close()
Testing
pip install memctl[dev]
pytest tests/ -v
479 tests across 14 test files covering types, store, policy, ingest, text extraction, similarity, loop controller, mount, sync, inspect, forward compatibility, contracts, CLI (subprocess), and pipe composition.
License
MIT License. See LICENSE for details.
Author: Olivier Vitrac, PhD, HDR | olivier.vitrac@adservio.fr | Adservio Innovation Lab