Lossless Context Management for Recursive Language Model

MemoryForge

MemoryForge is an API-first, local-first memory layer for long-running LLM workflows.

It stores durable project evidence in SQLite, keeps live context bounded, and returns source-backed context bundles to the model or agent the user is already running. The primary integration is the hookless MemoryForgeSession API: build context before a turn, call your model or agent, then record the completed turn. Codex MCP remains supported as an adapter, and Codex hooks are optional.

User/App -> MemoryForgeSession -> your LLM or agent
              |
              +-> SQLite memory.db
              +-> bounded CoreContextBundle
              +-> RLM/LTM retrieval
              +-> LCM compaction checks

Codex CLI -> MemoryForge MCP -> same SQLite memory.db

What It Is For

MemoryForge focuses on long-form project evidence that is common in real coding workflows but too large or noisy to paste into every prompt:

design notes and architecture documents
requirements and implementation plans
setup guides and decision records
benchmark descriptions and experiment logs
large Markdown files used while "vibe coding" or building a project over many sessions

These sources are ingested through RLM, stored as durable LTM evidence, and recalled as bounded context when Codex needs them. The shallow path is the RLM sub-agent analysis/summary stored in LTM; the deep path remains the lossless rlm_chunk:<id> content that can be rehydrated on demand.
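The shallow/deep split can be pictured with a small SQLite sketch. This is not MemoryForge's schema: the table names, the `ingest` and `rehydrate` helpers, and the first-line "summary" are hypothetical stand-ins. Only the `rlm_chunk:<id>` reference format comes from the text above.

```python
import sqlite3


def chunk_text(text: str, size: int) -> list[str]:
    """Split text into fixed-size pieces; joining them reproduces the input."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def ingest(conn: sqlite3.Connection, source: str, text: str, size: int = 200) -> list[int]:
    """Store every chunk verbatim (deep path) plus a short summary row (shallow path)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks (id INTEGER PRIMARY KEY, source TEXT, body TEXT)"
    )
    conn.execute("CREATE TABLE IF NOT EXISTS summaries (ref TEXT PRIMARY KEY, summary TEXT)")
    ids = []
    for body in chunk_text(text, size):
        cur = conn.execute("INSERT INTO chunks (source, body) VALUES (?, ?)", (source, body))
        chunk_id = cur.lastrowid
        # Hypothetical stand-in for sub-agent analysis: first non-empty line, truncated.
        stripped = body.strip()
        summary = stripped.splitlines()[0][:80] if stripped else ""
        conn.execute(
            "INSERT INTO summaries (ref, summary) VALUES (?, ?)",
            (f"rlm_chunk:{chunk_id}", summary),
        )
        ids.append(chunk_id)
    return ids


def rehydrate(conn: sqlite3.Connection, ref: str) -> str:
    """Resolve an rlm_chunk:<id> reference back to the exact stored text."""
    chunk_id = int(ref.split(":", 1)[1])
    row = conn.execute("SELECT body FROM chunks WHERE id = ?", (chunk_id,)).fetchone()
    if row is None:
        raise KeyError(ref)
    return row[0]
```

Because chunks are stored verbatim, concatenating the rehydrated chunks in order gives back the original document, while prompts only need to carry the short summaries and their refs.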

MemoryForge intentionally avoids a large AST/code graph schema in the core release. The default auto-load path targets Markdown project knowledge (README, design notes, ADRs, plans, reports, specs). Code files can still be ingested manually, but dedicated code indexing is future work rather than part of the current SQLite schema.

Memory Layers

Layer	Role	Model worker use
RLM	Raw Large Memory. Chunks large files/prompts, can run Codex sub-agent analysis, and indexes both derived summaries and full chunks into durable memory.	Explicit through `rlm-run`.
LTM	Long-Term Memory. Recalls durable evidence across sessions and sources.	No model call.
LCM	Lossless Context Management. Keeps the MemoryForge active-session view bounded with summaries, raw refs, and recoverable tool-output parts.	Built into `MemoryForgeSession`; optional worker for compaction.

Important boundary: LCM compacts MemoryForge's SQLite-backed active context. It does not directly erase Codex's own context window. Codex manages its live thread and can compact it with /compact; MemoryForge MCP tools and optional Codex hooks preserve and rehydrate the evidence needed after compaction.
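As a rough illustration of what "bounded with summaries and raw refs" can mean, the sketch below keeps the newest messages that fit a character budget and replaces older ones with a summary line plus the IDs needed to fetch them again. The `Message`, `ActiveContext`, and `compact` names and the character-based budget are assumptions for this example, not the MemoryForge implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: int
    role: str
    text: str


@dataclass
class ActiveContext:
    summary: str = ""
    raw_refs: list[int] = field(default_factory=list)  # IDs of compacted messages
    live: list[Message] = field(default_factory=list)  # messages kept verbatim


def compact(messages: list[Message], budget_chars: int) -> ActiveContext:
    """Keep the newest messages within the budget; refer to the rest by ID."""
    live: list[Message] = []
    used = 0
    for msg in reversed(messages):  # walk newest first
        if used + len(msg.text) > budget_chars:
            break
        live.append(msg)
        used += len(msg.text)
    live.reverse()  # restore chronological order
    older = messages[: len(messages) - len(live)]
    summary = f"{len(older)} earlier message(s) compacted" if older else ""
    return ActiveContext(summary=summary, raw_refs=[m.msg_id for m in older], live=live)
```

Nothing is deleted in this sketch: the older messages stay wherever they are stored, and the context only swaps their text for references.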

Install

From PyPI in a project that uses uv:

uv add memfg==6.1.3

MemoryForge defaults to lexical BM25/FTS recall so first run stays responsive on fresh PyPI installs and offline machines. Semantic/vector recall is available when explicitly enabled with MEMORYFORGE_VECTOR_BACKEND=fastembed.
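For readers unfamiliar with lexical recall in SQLite, here is a minimal sketch using the FTS5 extension and its built-in `bm25()` ranking function. It assumes the SQLite library bundled with your Python build includes FTS5; the `evidence` table and the helper names are hypothetical and do not reflect MemoryForge's schema.

```python
import sqlite3


def build_index(docs: dict[str, str]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table holding (source, body) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE evidence USING fts5(source, body)")
    conn.executemany("INSERT INTO evidence (source, body) VALUES (?, ?)", docs.items())
    return conn


def recall(conn: sqlite3.Connection, query: str, limit: int = 3) -> list[str]:
    """Return sources ranked by BM25; lower bm25() values are better matches."""
    words = query.split()
    if not words:
        return []
    # Quote each term so punctuation is not parsed as FTS5 query syntax.
    terms = " OR ".join('"' + w.replace('"', '""') + '"' for w in words)
    rows = conn.execute(
        "SELECT source FROM evidence WHERE evidence MATCH ? "
        "ORDER BY bm25(evidence) LIMIT ?",
        (terms, limit),
    ).fetchall()
    return [row[0] for row in rows]
```

No model download or network access is involved, which is consistent with the stated goal of a responsive first run on offline machines.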

For local development from this repository:

uv sync --extra dev --extra benchmark

PyPI 6.1.3 Post-Publish Smoke Test

After publishing with:

uvx --from twine twine upload dist/memfg-6.1.3*

verify the package from PyPI in a clean project. The flow below is written for Windows PowerShell; the uv, memoryforge, and codex commands work in other shells once path syntax, line continuations, and the PowerShell-only file-creation steps are adapted.

Create a fresh uv project outside this repository:

New-Item -ItemType Directory -Force C:\tmp\memoryforge-pypi-smoke | Out-Null
Set-Location C:\tmp\memoryforge-pypi-smoke
uv init --bare --name memoryforge-pypi-smoke .

Install the freshly published PyPI package:

uv add memfg==6.1.3
uv run memoryforge --help

uv run memoryforge --help must print CLI usage and exit. Do not use uv run memoryforge-mcp as a normal smoke test; that command starts the MCP stdio server and waits for a client, so an idle terminal there is expected.

Re-register the MemoryForge MCP server with Codex:

codex mcp remove memoryforge
codex mcp add memoryforge -- uv run memoryforge-mcp
codex mcp get memoryforge --json

If codex mcp remove memoryforge says the server does not exist, continue with the codex mcp add command. Restart Codex after changing MCP registration.

Initialize MemoryForge in the project:

uv run memoryforge init . --agent-id codex --force

Expected behavior for 6.1.3:

The command exits by itself.
.memoryforge\memory.db is created.
.memoryforge\config.json is created.
AGENTS.md is created or updated.
The JSON output shows "indexed": {"enabled": false, ...}.
The JSON output shows "codex /init not requested" unless you passed --configure-codex.

init is intentionally lightweight. It does not index Markdown by default, and it does not call Codex CLI subprocesses by default.
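The file checks in the list above can be scripted. The helper below is a hypothetical convenience, not part of the memfg package; it only looks for the three paths named above and for the instruction-block start marker that this README shows for AGENTS.md.

```python
import tempfile
from pathlib import Path

MARKER = "<!-- MemoryForge instructions start -->"


def missing_init_outputs(root: Path) -> list[str]:
    """Return the expected init artifacts that are absent under root."""
    expected = [
        Path(".memoryforge") / "memory.db",
        Path(".memoryforge") / "config.json",
        Path("AGENTS.md"),
    ]
    missing = [p.as_posix() for p in expected if not (root / p).is_file()]
    agents = root / "AGENTS.md"
    # AGENTS.md may pre-exist, so also confirm the guarded block was written.
    if agents.is_file() and MARKER not in agents.read_text(encoding="utf-8"):
        missing.append("AGENTS.md instruction block")
    return missing


if __name__ == "__main__":
    # Demo on an empty directory: every expected path is reported as missing.
    with tempfile.TemporaryDirectory() as tmp:
        print(missing_init_outputs(Path(tmp)))
```

Run it against the smoke-test directory after `memoryforge init`; an empty list means the file-level expectations are met. It does not inspect the JSON output of the command.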

Add a small Markdown memory file:

New-Item -ItemType Directory -Force docs | Out-Null
@'
# Facilities telemetry

Facilities use telemetry ingress endpoint https://telemetry.facilities.example.com/v2/ingest.

The old endpoint https://telemetry.old.example.com/ingest was rejected because it bypassed tenant isolation and failed TLS pinning.
'@ | Set-Content -Encoding UTF8 docs\telemetry.md

Index the Markdown file through the full RLM pipeline. This is the explicit sub-agent path; init stays lightweight.

uv run memoryforge index . `
  --agent-id codex `
  --runner codex `
  --model gpt-5.4 `
  --max-files 1 `
  --batch-size 1 `
  --max-workers 1 `
  --force

For a no-model local smoke test, replace --runner codex with --runner mock. Real validation on Windows should be run from PowerShell, because the Codex account/config used by PowerShell can differ from WSL.

init --index remains accepted for compatibility, but the recommended command is memoryforge index.

Verify recall without Codex:

uv run memoryforge --db .memoryforge\memory.db recall-memory `
  --agent-id codex `
  --query "telemetry ingress endpoint facilities old value rejected" `
  --include-content

The output should include https://telemetry.facilities.example.com/v2/ingest and the rejection reason about tenant isolation and TLS pinning.

Verify the hookless session API. This path does not use Codex hooks, MCP, or uv run memoryforge-mcp; it is the primary LCM workflow:

@'
from memoryforge import MemoryForgeSession

with MemoryForgeSession.open(
    db_path=".memoryforge/memory.db",
    agent_id="codex",
    session_id="pypi-hookless-demo",
    system_prompt="Use MemoryForge project memory.",
) as session:
    bundle = session.context_for_next_turn(
        "What is the telemetry ingress endpoint used by facilities?",
        include_content=True,
    )
    print("context messages:", len(bundle.messages))

    result = session.record(
        "What is the telemetry ingress endpoint used by facilities?",
        "Facilities use https://telemetry.facilities.example.com/v2/ingest. "
        "The old endpoint was rejected because it bypassed tenant isolation and failed TLS pinning.",
        tool_outputs=[{
            "tool_name": "manual-check",
            "tool_call_id": "tool-1",
            "output": "Read docs/telemetry.md",
        }],
    )
    print(result.to_dict())

    print(session.messages())
'@ | Set-Content -Encoding UTF8 smoke_session.py

uv run python smoke_session.py

Expected behavior:

The script exits by itself.
The printed turn_ids list has 3 IDs: user, tool output, assistant.
session.messages() shows an assistant message whose first part has part_type: "tool" and content Read docs/telemetry.md.

Inspect the same session through the LCM observability CLI:

uv run memoryforge --db .memoryforge\memory.db lcm-messages `
  --session-id pypi-hookless-demo `
  --agent-id codex `
  --include-content

uv run memoryforge --db .memoryforge\memory.db lcm-context `
  --session-id pypi-hookless-demo

Optional: verify recall through Codex MCP. Start Codex from the same project directory:

codex

Ask:

What is the telemetry ingress endpoint used by facilities, and why was the old value rejected?

Expected Codex behavior:

It should call memoryforge.recall_memory.
The tool call should return quickly for this small project.
The answer should cite the new endpoint and explain why the old value was rejected.

If Codex does not call MemoryForge, ask explicitly:

Use MemoryForge MCP recall_memory first. What is the telemetry ingress endpoint used by facilities, and why was the old value rejected?

Optional LCM lifecycle capture for Codex interactive mode. Leave this off for the first install smoke test. Enable it only after the MCP recall path works:

uv run memoryforge init . --agent-id codex --force --install-hooks

Expected files:

.codex\hooks.json
.memoryforge\hooks\memoryforge-hook.cmd

Restart Codex from the same project directory:

codex

Then run /hooks in Codex and trust the MemoryForge hook definitions. Codex requires this review for project-local command hooks. The MemoryForge hook is intentionally small: it calls python -m memoryforge.cli.main hook ... through the project Python when available, uses uv run --no-sync as a fallback, and has a 10-second timeout. It does not call codex, does not start a model worker, does not sync/reinstall the project, and does not index the whole project on startup unless MEMORYFORGE_HOOK_AUTO_INDEX=1 is set.

The hook path listens for:

SessionStart: cleans stale pending hook files. It does not auto-index Markdown by default.
UserPromptSubmit: stores the pending user prompt and records an LCM context snapshot.
PostToolUse: stores tool output as an assistant message with a tool part when Codex supplies tool output in the hook payload.
Stop: commits the completed turn; if Codex supplies assistant output, it stores user + tool + assistant, otherwise it still commits the user prompt so the session is not empty.
PreCompact and PostCompact: record context snapshots around Codex /compact; PostCompact also stores a compact summary if Codex supplies one.
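The event handling described above amounts to a small state machine: buffer the prompt and tool outputs, then commit on Stop. The sketch below illustrates that flow under stated assumptions. The `HookLifecycle` class and the payload keys (`prompt`, `tool_output`, `assistant_output`) are hypothetical; the real Codex hook payload shape is not documented here, and the compaction events are omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PendingTurn:
    prompt: Optional[str] = None
    tool_outputs: list[str] = field(default_factory=list)


class HookLifecycle:
    """Buffers one turn across hook events and commits it on Stop."""

    def __init__(self) -> None:
        self.pending = PendingTurn()
        self.committed: list[tuple[str, str]] = []  # (kind, content) in storage order

    def handle(self, event: str, payload: dict) -> None:
        if event == "SessionStart":
            self.pending = PendingTurn()  # drop stale pending state
        elif event == "UserPromptSubmit":
            self.pending.prompt = payload.get("prompt", "")
        elif event == "PostToolUse":
            output = payload.get("tool_output")
            if output is not None:  # only stored when the payload carries output
                self.pending.tool_outputs.append(output)
        elif event == "Stop":
            self._commit(payload.get("assistant_output"))

    def _commit(self, assistant_output: Optional[str]) -> None:
        if self.pending.prompt is None:
            return  # nothing was submitted in this turn
        self.committed.append(("user", self.pending.prompt))
        if assistant_output is not None:
            for output in self.pending.tool_outputs:
                self.committed.append(("tool", output))
            self.committed.append(("assistant", assistant_output))
        self.pending = PendingTurn()
```

The fallback branch mirrors the Stop rule above: when no assistant output arrives, the user prompt is still committed so the session is not empty.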

Ask Codex one real question, then inspect the MemoryForge LCM database:

uv run memoryforge --db .memoryforge\memory.db lcm-sessions `
  --agent-id codex

uv run memoryforge --db .memoryforge\memory.db lcm-messages `
  --session-id <session-id-from-lcm-sessions> `
  --agent-id codex `
  --include-content

uv run memoryforge --db .memoryforge\memory.db lcm-context `
  --session-id <session-id-from-lcm-sessions>

If lcm-sessions shows message_count: 0, the hook was not trusted, Codex was not restarted after installing hooks, or Codex did not run from the initialized project directory.

Optional semantic vector recall. Leave this off for the first smoke test. To enable FastEmbed later, set the environment before indexing and before starting Codex:

$env:MEMORYFORGE_VECTOR_BACKEND='fastembed'
$env:MEMORYFORGE_VECTOR_MODEL='BAAI/bge-small-en-v1.5'

Without those variables, 6.1.3 uses lexical BM25/FTS recall by default so first run stays responsive on Windows and offline machines.

Primary Hookless Session API

MemoryForgeSession is the primary API. It is intentionally close to mnesis' BYO-LLM path: MemoryForge does not need shell hooks because the caller records the completed turn explicitly.

from memoryforge import MemoryForgeSession

with MemoryForgeSession.open(
    db_path=".memoryforge/memory.db",
    agent_id="codex",
    session_id="session-1",
    system_prompt="Use project memory and cite durable evidence.",
) as session:
    # 1. Build active context before the model/agent call.
    model_payload = session.model_payload_for_next_turn(
        "What endpoint do facilities use for telemetry?",
        include_content=True,
    )
    model_messages = model_payload["messages"]

    # 2. Call your model, SDK, local agent, or Codex wrapper with model_messages.
    assistant_text = (
        "Facilities use https://telemetry.facilities.example.com/v2/ingest."
    )

    # 3. Record the completed turn after the answer exists.
    result = session.record(
        "What endpoint do facilities use for telemetry?",
        assistant_text,
        tool_outputs=[
            {
                "tool_name": "docs-search",
                "tool_call_id": "tool-1",
                "output": "Matched docs/telemetry.md",
            }
        ],
    )
    print(result.to_dict())

This stores the user message, assistant answer, and tool output in the LCM message tables, indexes the completed turn into LTM, and runs a safe compaction check. Tool output is stored as an assistant message with part_type="tool" so existing LCM/RLM/LTM logic stays compatible.
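To make the storage shape concrete, the sketch below maps one recorded turn to three messages in the order the smoke test reports (user, tool output, assistant), with the tool output carried as an assistant message whose part has `part_type="tool"`. The `Part` and `StoredMessage` classes and the `"text"` part type are assumptions for illustration; the actual LCM tables may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    part_type: str
    content: str


@dataclass
class StoredMessage:
    role: str
    parts: list[Part] = field(default_factory=list)


def turn_to_messages(user_text: str, assistant_text: str, tool_outputs=()) -> list[StoredMessage]:
    """Expand one completed turn into messages: user, tool output(s), assistant."""
    messages = [StoredMessage("user", [Part("text", user_text)])]
    for tool in tool_outputs:
        # Tool output rides on an assistant message rather than a separate role.
        messages.append(StoredMessage("assistant", [Part("tool", tool["output"])]))
    messages.append(StoredMessage("assistant", [Part("text", assistant_text)]))
    return messages
```

Keeping tool output under the assistant role means code that only understands user and assistant messages does not need a third case.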

Inspect a hookless session:

uv run memoryforge --db .memoryforge/memory.db lcm-messages \
  --session-id session-1 \
  --agent-id codex \
  --include-content

uv run memoryforge --db .memoryforge/memory.db lcm-context \
  --session-id session-1

Important: if you type directly into Codex CLI interactive mode, MemoryForge cannot see the completed turn unless Codex calls an MCP tool or you enable the optional Codex hook adapter. Hookless lossless capture works when your app, script, SDK wrapper, or future memoryforge codex wrapper routes the lifecycle through MemoryForgeSession.

Optional Codex MCP Adapter

For Codex CLI recall/context tools, register the MemoryForge MCP server:

codex mcp add memoryforge -- uv run memoryforge-mcp

Then run MemoryForge init at the project root:

uv run memoryforge init . --agent-id codex --force

This creates:

.memoryforge/memory.db
.memoryforge/config.json
AGENTS.md

During init, MemoryForge creates or updates the root AGENTS.md with a guarded MemoryForge instruction block:

<!-- MemoryForge instructions start -->
...
<!-- MemoryForge instructions end -->
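A guarded block like this lets a tool rewrite its own section of a shared file without touching the rest. The function below is a minimal sketch of that create-or-update behavior, assuming plain string markers exactly as shown above; `upsert_guarded_block` is a hypothetical name, not the function MemoryForge uses.

```python
START = "<!-- MemoryForge instructions start -->"
END = "<!-- MemoryForge instructions end -->"


def upsert_guarded_block(document: str, instructions: str) -> str:
    """Replace the guarded block if present, otherwise append it."""
    block = f"{START}\n{instructions.strip()}\n{END}"
    start = document.find(START)
    end = document.find(END, start + len(START)) if start != -1 else -1
    if start != -1 and end != -1:
        # Swap only the text between (and including) the markers.
        return document[:start] + block + document[end + len(END):]
    if not document or document.endswith("\n\n"):
        separator = ""
    elif document.endswith("\n"):
        separator = "\n"
    else:
        separator = "\n\n"
    return document + separator + block + "\n"
```

Applying the function twice with the same instructions leaves the file unchanged, which is the property that makes repeated `init --force` runs safe for hand-written content outside the markers.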

MemoryForge does not create project-local .codex/ files or install Codex hooks by default. It also does not call Codex CLI subprocesses unless you pass --configure-codex or explicitly run memoryforge index --runner codex. The MCP adapter intentionally exposes only the hot-path tools:

recall_memory: fast factual recall from durable RLM/LTM indexes
build_context_bundle: grounded LCM/LTM context assembly for the active model

Project indexing and RLM worker analysis are explicit CLI/API operations, not MCP tools. This keeps Codex from seeing low-level ingestion tools or launching sub-agents accidentally while answering.

Optional Codex Hook Adapter

Use hooks only if you need Codex interactive mode to auto-capture prompts, tool outputs, and stop events. Hooks are not the primary MemoryForge API.

uv run memoryforge init . --agent-id codex --force --install-hooks

This creates .codex/hooks.json plus a tiny hook runner under .memoryforge/hooks/. After restarting Codex, run /hooks and trust the MemoryForge hook definitions. Without that trust step, Codex will skip project-local command hooks.

LCM lifecycle capture is additive. RLM/LTM ingestion and recall keep working the same way; hooks only append active-session turns and context snapshots into the LCM tables.

Basic Usage

Default project usage is MemoryForgeSession for hookless lifecycle capture. The CLI commands below are the direct/manual surface for indexing, recall, context inspection, and maintenance.

Ingest a long Markdown file or project document:

uv run memoryforge --db .memoryforge/memory.db ingest-file docs/notes.md \
  --agent-id codex

Run project Markdown through the full RLM indexing path:

uv run memoryforge index . \
  --agent-id codex \
  --runner codex \
  --model gpt-5.4

For targeted/debug usage, rlm-run still exists as an advanced CLI command:

uv run memoryforge --db .memoryforge/memory.db rlm-run docs/design.md \
  --agent-id codex \
  --name design-notes \
  --runner codex \
  --model gpt-5.4 \
  --project-root .

The primary index path chunks sources losslessly, runs RLM sub-agent analysis, stores rlm_analysis/rlm_summary rows in LTM, and preserves exact rlm_chunk:<id> refs for deep rehydration.

Recall durable evidence:

uv run memoryforge --db .memoryforge/memory.db recall-memory \
  --agent-id codex \
  --query "why did we choose sqlite"

Build a runtime context bundle for the active Codex project:

uv run memoryforge --db .memoryforge/memory.db runtime-context \
  --agent-id codex \
  --session-id session-1 \
  --query "what context should Codex use now" \
  --project-root .

Run LCM compaction over MemoryForge's stored active context:

uv run memoryforge --db .memoryforge/memory.db lcm-compact \
  --agent-id codex \
  --session-id session-1 \
  --project-root . \
  --force

Inspect the active LCM state before or after compaction:

uv run memoryforge --db .memoryforge/memory.db lcm-sessions \
  --agent-id codex

uv run memoryforge --db .memoryforge/memory.db lcm-messages \
  --session-id session-1 \
  --agent-id codex \
  --include-content

uv run memoryforge --db .memoryforge/memory.db lcm-summary \
  --session-id session-1

Run the MCP server directly:

uv run memoryforge-mcp

Vector Recall

MemoryForge uses lexical BM25/FTS recall by default. To enable semantic vector recall with FastEmbed and store local embeddings in vec_index, configure:

export MEMORYFORGE_VECTOR_BACKEND=fastembed
export MEMORYFORGE_VECTOR_MODEL=BAAI/bge-small-en-v1.5

For explicit lexical-only fallback, set MEMORYFORGE_VECTOR_BACKEND=disabled. The project intentionally keeps one vector cache table, vec_index, and avoids SQLite extension backends such as sqlite-vec in the core release. This keeps the package easier to install, test, and publish.
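One way to keep vectors in plain SQLite without an extension is to store each embedding as a BLOB and score candidates in Python. The sketch below shows that idea only; the column names, the float32 encoding, and the brute-force cosine scan are assumptions, and just the table name `vec_index` comes from the text above. Real embeddings would come from the configured FastEmbed model rather than hand-written lists.

```python
import math
import sqlite3
from array import array


def store_vector(conn: sqlite3.Connection, ref: str, vector: list[float]) -> None:
    """Persist one embedding as a float32 BLOB keyed by its evidence ref."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vec_index "
        "(ref TEXT PRIMARY KEY, dim INTEGER, embedding BLOB)"
    )
    blob = array("f", vector).tobytes()
    conn.execute(
        "INSERT OR REPLACE INTO vec_index (ref, dim, embedding) VALUES (?, ?, ?)",
        (ref, len(vector), blob),
    )


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(conn: sqlite3.Connection, query: list[float], limit: int = 3) -> list[str]:
    """Brute-force scan: decode every stored vector and rank by cosine similarity."""
    scored = []
    for ref, blob in conn.execute("SELECT ref, embedding FROM vec_index"):
        vec = array("f")
        vec.frombytes(blob)
        scored.append((cosine(query, vec), ref))
    scored.sort(reverse=True)
    return [ref for _, ref in scored[:limit]]
```

A linear scan is slower than an indexed vector extension on large corpora, but it needs nothing beyond the standard library, which matches the install-simplicity trade-off described above.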

Retrieval is hybrid by design: vector recall and lexical recall can both contribute candidates, and MemoryForge fuses bounded evidence for the runtime context instead of relying on a vector-only path.
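The README does not say how the two candidate lists are fused. Reciprocal rank fusion (RRF) is one common technique for this kind of hybrid retrieval and is used here purely as an illustration; the function name, the constant `k=60`, and the tie-breaking rule are choices made for this sketch.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60, limit: int = 5) -> list[str]:
    """Merge ranked lists: each appearance adds 1 / (k + rank) to a ref's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, ref in enumerate(ranking, start=1):
            scores[ref] = scores.get(ref, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for determinism.
    ordered = sorted(scores, key=lambda ref: (-scores[ref], ref))
    return ordered[:limit]  # the limit keeps the fused evidence bounded
```

For example, with a lexical ranking `["a", "b", "c"]` and a vector ranking `["b", "d", "a"]`, refs that appear in both lists (`b` and `a`) rise above refs found by only one retriever.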

CLI Surface

Public commands:

Project/runtime: init, mcp-server, runtime-context
Conversation memory: store-session, search, recall-memory, active-recall, long-term-source
Contradictions: record-contradiction, find-contradictions
LCM: lcm-context, lcm-sessions, lcm-messages, lcm-summary, lcm-compact, lcm-maintain
RLM/source loading: ingest-file, rlm-load, rlm-search, rlm-chunk-get, dispatch, context-get, rlm-record, aggregate, rlm-run
Diagnostics: chunk, benchmark

memoryforge hook remains available as an internal endpoint for direct testing. RLM/LCM sub-agents are internal MemoryForge workers. For real worker runs, MemoryForge uses Codex CLI through codex exec when configured. Development-time Codex host subagents are separate review/triage helpers and are not the MemoryForge runtime worker API.

Benchmarks

The current benchmark focus is long-memory behavior, not static code indexing:

LoCoMo
LongMemEval
deterministic multi-session stress benchmark
synthetic smoke benchmark

Example smoke check:

uv run python benchmarks/synthetic_test.py

Stress check for many real SQLite sessions:

uv run python benchmarks/stress_sessions.py \
  --sessions 100 \
  --turns-per-session 12 \
  --output benchmarks/results/stress_sessions_100x12.json

Real LoCoMo and LongMemEval runs require their datasets and model credentials. See docs/BENCHMARKS.md for run modes and result fields.

Development

Run the normal quality gate on Unix-like shells:

make check

Equivalent commands:

uv run ruff check memoryforge tests benchmarks
PYTHONDONTWRITEBYTECODE=1 uv run mypy memoryforge
PYTHONDONTWRITEBYTECODE=1 PYTHONPATH=. uv run pytest --ignore=tests/test_real_subagents.py --cov=memoryforge --cov-report=term-missing --cov-fail-under=77
MEMORYFORGE_REAL_SUBAGENT=1 MEMORYFORGE_REAL_PROJECT_ROOT="$PWD" MEMORYFORGE_SUBAGENT_RUNNER=codex MEMORYFORGE_MODEL=gpt-5.4 uv run pytest tests/test_real_subagents.py -vv
uv build
uv run twine check dist/*

On Windows PowerShell, keep pytest temp/cache paths in writable directories:

$env:TMP='C:\tmp'; $env:TEMP='C:\tmp'
Remove-Item Env:\MEMORYFORGE_SUBAGENT_RUNNER -ErrorAction SilentlyContinue
Remove-Item Env:\MEMORYFORGE_MODEL -ErrorAction SilentlyContinue
uv run pytest --ignore=tests/test_real_subagents.py --basetemp=C:\tmp\memoryforge-pytest-basetemp -o cache_dir=.tmp\pytest-cache

$env:MEMORYFORGE_REAL_SUBAGENT='1'; $env:MEMORYFORGE_REAL_PROJECT_ROOT=(Get-Location).Path
$env:MEMORYFORGE_SUBAGENT_RUNNER='codex'; $env:MEMORYFORGE_MODEL='gpt-5.4'
uv run pytest tests/test_real_subagents.py -vv --basetemp=C:\tmp\memoryforge-pytest-basetemp-real -o cache_dir=.tmp\pytest-cache-real

The real Codex sub-agent smoke tests are local-only. Run them on a machine with the Codex CLI installed and authenticated if you want to verify runner="codex"; they are not part of CI/CD. Mock runners are only for targeted unit tests that verify MemoryForge's own control flow.

Release Notes For Maintainers

Before pushing or publishing:

Keep generated data out of the release: .venv/, caches, .coverage, .memoryforge/, .codebase-memory/, dist/, and benchmarks/results/.
Run the full gate on Python 3.10, 3.11, and 3.12 through CI.
Build the wheel and sdist with uv build.
Check distributions with twine check.
Prefer the Publish GitHub workflow with PyPI trusted publishing. Direct maintainer uploads may use twine upload with TWINE_USERNAME=__token__ and TWINE_PASSWORD supplied from the shell environment, never from a committed config file.

Release history

6.1.5

Jun 29, 2026

6.1.4

Jun 29, 2026

This version

6.1.3

Jun 29, 2026

6.1.2

Jun 29, 2026

6.1.1

Jun 28, 2026

6.1.0

Jun 28, 2026

6.0.9

Jun 28, 2026

6.0.8

Jun 28, 2026

6.0.7

Jun 28, 2026

6.0.6

Jun 27, 2026

6.0.5

Jun 27, 2026

6.0.4

Jun 27, 2026

6.0.3

Jun 27, 2026

6.0.2

Jun 27, 2026

6.0.1

Jun 27, 2026

Download files

Source Distribution

memfg-6.1.3.tar.gz (195.0 kB view details)

Uploaded Jun 29, 2026 Source

Built Distribution

memfg-6.1.3-py3-none-any.whl (164.2 kB view details)

Uploaded Jun 29, 2026 Python 3

File details

Details for the file memfg-6.1.3.tar.gz.

File metadata

Download URL: memfg-6.1.3.tar.gz
Upload date: Jun 29, 2026
Size: 195.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for memfg-6.1.3.tar.gz
Algorithm	Hash digest
SHA256	`7225f6913364ec4fa6d4be30a5cd4b75a6bd1a4e0d299df00f08c40fd3926094`
MD5	`35ac4685541a813115c9d7a4769e37f4`
BLAKE2b-256	`3b41f0f7194007c9f275f411fa0cd0fd504e9a6612942e68abb3d04cd6147502`

File details

Details for the file memfg-6.1.3-py3-none-any.whl.

File metadata

Download URL: memfg-6.1.3-py3-none-any.whl
Upload date: Jun 29, 2026
Size: 164.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for memfg-6.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ca0bec5aeccc0d52f9615227d60dcf3ce0eda8a0c0f686ef6b7323dd20b64812`
MD5	`8465fced53b8c65fc599b768f8e2a34e`
BLAKE2b-256	`67dc1a30942a004d2b02bfba0c2bacc16970484ea9ac60bed6eddec776ce4f97`
