
Lossless Context Management — persistent memory for LLM agent sessions


mnesis logo

mnesis

Lossless Context Management for long-horizon LLM agents



LLMs suffer from context rot: accuracy degrades 30–40% before hitting nominal token limits — not because the model runs out of space, but because reasoning quality collapses as the window fills with stale content.

The standard fix — telling the model to "summarize itself" — is unreliable. The model may silently drop constraints, forget file paths, or produce a summary that is itself too large.

mnesis solves this by making the engine — not the model — responsible for memory. It is a Python implementation of the LCM (Lossless Context Management) architecture.


Benchmarks

Evaluated on OOLONG, a long-context reasoning and aggregation benchmark. Both LCM-managed and Claude Code agents are built on Claude Opus 4.6; the gap comes entirely from context architecture.

The charts below compare LCM-managed context against Claude Code and unmanaged Opus 4.6 across context lengths from 8K to 1M tokens.

Score improvement over raw Opus 4.6 at each context length:

OOLONG benchmark — score improvement over raw Opus 4.6

Absolute scores vs raw Opus 4.6 baseline:

OOLONG benchmark — absolute scores

Raw Opus 4.6 uses no context management — scores collapse past 32K tokens.


How it works

Traditional agentic frameworks (RLMs, Recursive Language Models) ask the model to manage its own context via tool calls. LCM moves that responsibility to a deterministic engine layer:

RLM vs LCM approach

The engine handles memory deterministically so the model can focus entirely on the task.


Key properties

| | RLM (e.g. raw Claude Code) | mnesis |
|---|---|---|
| Context trigger | Model judgment | Token threshold |
| Summarization failure | Silent data loss | Three-level fallback (never fails) |
| Tool output growth | Unbounded | Backward-scan pruner |
| Large files | Inline (eats budget) | Content-addressed references |
| Parallel workloads | Sequential or ad-hoc | LLMMap / AgenticMap operators |
| History | Ephemeral | Append-only SQLite log |

Quick Start

```bash
uv add mnesis
```

```python
import asyncio
from mnesis import MnesisSession

async def main():
    async with await MnesisSession.create(
        model="anthropic/claude-opus-4-6",
        system_prompt="You are a helpful assistant.",
    ) as session:
        result = await session.send("Explain the GIL in Python.")
        print(result.text)

asyncio.run(main())
```

No API key needed to try it — set MNESIS_MOCK_LLM=1 and run any of the examples.

Provider support

mnesis works with any LLM provider via litellm. Pass the model string and set the corresponding API key:

| Provider | Model string | API key env var |
|---|---|---|
| Anthropic | `"anthropic/claude-opus-4-6"` | `ANTHROPIC_API_KEY` |
| OpenAI | `"openai/gpt-4o"` | `OPENAI_API_KEY` |
| Google Gemini | `"gemini/gemini-1.5-pro"` | `GEMINI_API_KEY` |
| OpenRouter | `"openrouter/meta-llama/llama-3.1-70b-instruct"` | `OPENROUTER_API_KEY` |

See the Provider Configuration guide for details on per-provider setup.

BYO-LLM — use your own SDK

If you already call the Anthropic SDK, the OpenAI SDK, or another provider SDK directly, use session.record() to let mnesis handle memory and compaction without routing calls through litellm:

```python
# Run this inside an async function: MnesisSession.create() and record() are awaited.
import anthropic
from mnesis import MnesisSession, TokenUsage

client = anthropic.Anthropic()
session = await MnesisSession.create(model="anthropic/claude-opus-4-6")

user_text = "Explain quantum entanglement."
response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": user_text}],
)

await session.record(
    user_message=user_text,
    assistant_response=response.content[0].text,
    tokens=TokenUsage(
        input=response.usage.input_tokens,
        output=response.usage.output_tokens,
    ),
)
```

Core Concepts

Immutable Store + Active Context

Every message and tool result is appended to an SQLite log and never modified. Each turn, the engine assembles a curated view of the log that fits the model's token budget — the Active Context.
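The split between the permanent log and the per-turn view can be illustrated with a minimal sketch. This is plain `sqlite3`, not mnesis's actual schema — it only shows the idea: appends are never modified, and the Active Context is a budget-limited view assembled fresh each turn.

```python
import sqlite3

# Append-only log: rows are inserted, never updated or deleted.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE log (id INTEGER PRIMARY KEY, role TEXT, text TEXT, tokens INTEGER)")

def append(role: str, text: str, tokens: int) -> None:
    db.execute("INSERT INTO log (role, text, tokens) VALUES (?, ?, ?)", (role, text, tokens))

def active_context(budget: int) -> list[tuple[str, str]]:
    """Walk the log newest-first, keeping messages until the token budget is spent."""
    picked: list[tuple[str, str]] = []
    spent = 0
    for role, text, tokens in db.execute("SELECT role, text, tokens FROM log ORDER BY id DESC"):
        if spent + tokens > budget:
            break
        picked.append((role, text))
        spent += tokens
    return list(reversed(picked))  # restore chronological order

append("user", "first question", 400)
append("assistant", "first answer", 600)
append("user", "second question", 300)
ctx = active_context(budget=1000)  # only the newest messages fit; the log keeps everything
```

The oldest message falls out of the view but remains in the log, which is what makes later compaction lossless.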

Three-Level Compaction

When token usage crosses the threshold, the CompactionEngine escalates automatically:

  1. Level 1 — Structured LLM summary: Goal / Discoveries / Accomplished / Remaining
  2. Level 2 — Aggressive compression: drop reasoning, maximum conciseness
  3. Level 3 — Deterministic truncation: no LLM, always fits, never fails

Compaction runs asynchronously and never blocks a turn.
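The never-fail property comes from the shape of the escalation itself: each level is tried in order, and the last level is deterministic, so it cannot fail. An illustrative sketch (stand-in functions, not the real CompactionEngine):

```python
def level1(history: list[str], budget: int) -> str:
    # Stand-in for the structured LLM summary; may raise or come back oversized.
    raise RuntimeError("LLM summary unavailable")

def level2(history: list[str], budget: int) -> str:
    # Stand-in for aggressive LLM compression.
    raise RuntimeError("LLM compression unavailable")

def level3(history: list[str], budget: int) -> str:
    # Deterministic truncation: no LLM involved, always fits the budget.
    return " | ".join(history)[:budget]

def compact(history: list[str], budget: int) -> str:
    for level in (level1, level2, level3):
        try:
            summary = level(history, budget)
            if len(summary) <= budget:   # characters stand in for tokens here
                return summary
        except Exception:
            continue                      # escalate to the next level
    raise AssertionError("unreachable: level3 always fits")

summary = compact(["goal: refactor auth", "found bug in login"], budget=20)
```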

Tool Output Pruning

The ToolOutputPruner scans backward through history and tombstones completed tool outputs that fall outside a configurable protect window (default 40K tokens). Tombstoned outputs are replaced by a compact marker in the context — the data is still in the immutable store.
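The protect-window behaviour can be sketched as follows. This is an illustration of the idea, not the ToolOutputPruner API: walk newest-first, accumulate tokens, and tombstone only tool outputs that fall past the 40K-token window.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    kind: str        # "message" or "tool_output"
    tokens: int
    tombstoned: bool = False

def prune(history: list[Entry], protect_tokens: int = 40_000) -> int:
    """Scan newest-first; tombstone tool outputs once past the protect window."""
    freed = 0
    seen = 0
    for entry in reversed(history):
        seen += entry.tokens
        if seen <= protect_tokens:
            continue                     # inside the protected recent window
        if entry.kind == "tool_output" and not entry.tombstoned:
            entry.tombstoned = True      # the data stays in the immutable store
            freed += entry.tokens
    return freed

history = [
    Entry("tool_output", 30_000),   # old: falls outside the 40K protect window
    Entry("message", 15_000),
    Entry("tool_output", 20_000),   # recent: inside the window, kept inline
    Entry("message", 10_000),
]
freed = prune(history)
```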

Large File References

Files exceeding the inline threshold (default 10K tokens) are stored externally as FileRefPart objects with structural exploration summaries — AST outlines for Python, schema keys for JSON/YAML, headings for Markdown. The file is never re-read unless the model explicitly requests it.
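"Content-addressed" means the reference key is a hash of the file's bytes, so identical content always maps to the same reference and can be cached. A minimal sketch of that idea (hypothetical helper and dict layout, not the FileRefPart class; the chars-per-token divisor is a rough assumption):

```python
import hashlib

INLINE_THRESHOLD_TOKENS = 10_000

def estimate_tokens(text: str) -> int:
    return len(text) // 4            # rough chars-per-token heuristic

def file_part(path_label: str, content: str) -> dict:
    """Inline small files; replace large ones with a content-addressed reference."""
    if estimate_tokens(content) <= INLINE_THRESHOLD_TOKENS:
        return {"type": "inline", "text": content}
    digest = hashlib.sha256(content.encode()).hexdigest()
    return {
        "type": "file_ref",
        "path": path_label,
        "sha256": digest,            # same content -> same key -> cache hit
        "summary": content[:200],    # stand-in for the structural exploration summary
    }

small = file_part("notes.md", "short file")
large = file_part("dump.json", "x" * 100_000)
```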

Parallel Operators

  • LLMMap — stateless parallel LLM calls over a list of inputs with Pydantic schema validation and per-item retry. O(1) context cost to the caller.
  • AgenticMap — independent sub-agent sessions per input item, each with full multi-turn reasoning. The parent session sees only the final output.
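The concurrency pattern behind both operators can be sketched generically: fan out over a list with a bounded semaphore, collect results in input order, and surface only the outputs to the caller. This is plain asyncio, not the LLMMap API:

```python
import asyncio

async def bounded_map(fn, items, concurrency: int = 16):
    """Run fn over items with at most `concurrency` in flight; preserve input order."""
    sem = asyncio.Semaphore(concurrency)

    async def worker(item):
        async with sem:
            return await fn(item)

    # gather() returns results in the order of the input tasks, not completion order.
    return await asyncio.gather(*(worker(i) for i in items))

async def fake_llm_call(text: str) -> str:
    await asyncio.sleep(0)           # stands in for a network round-trip
    return text.upper()

results = asyncio.run(bounded_map(fake_llm_call, ["a", "b", "c"], concurrency=2))
```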

Configuration

```python
from mnesis import MnesisConfig, CompactionConfig, FileConfig

config = MnesisConfig(
    compaction=CompactionConfig(
        auto=True,
        buffer=20_000,               # tokens reserved for compaction output
        prune=True,
        prune_protect_tokens=40_000, # never prune within last 40K tokens
        prune_minimum_tokens=20_000, # skip pruning if volume is too small
        level2_enabled=True,
    ),
    file=FileConfig(
        inline_threshold=10_000,     # files > 10K tokens → FileRefPart
    ),
    doom_loop_threshold=3,           # consecutive identical tool calls before warning
)
```

| Parameter | Default | Description |
|---|---|---|
| `compaction.auto` | True | Auto-trigger on overflow |
| `compaction.buffer` | 20,000 | Tokens reserved for summary output |
| `compaction.prune_protect_tokens` | 40,000 | Recent tokens never pruned |
| `compaction.prune_minimum_tokens` | 20,000 | Minimum prunable volume |
| `file.inline_threshold` | 10,000 | Inline limit in tokens |
| `operators.llm_map_concurrency` | 16 | LLMMap parallel calls |
| `operators.agentic_map_concurrency` | 4 | AgenticMap parallel sessions |
| `doom_loop_threshold` | 3 | Identical tool call threshold |
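The doom_loop_threshold guards against an agent repeating the same tool call indefinitely. One plausible detection rule, sketched with a hypothetical helper (the real implementation may differ):

```python
def doom_loop(calls: list[tuple[str, str]], threshold: int = 3) -> bool:
    """True if the last `threshold` tool calls are identical (same name and args)."""
    if len(calls) < threshold:
        return False
    tail = calls[-threshold:]
    return all(c == tail[0] for c in tail)

# Three identical read_file calls in a row trip the default threshold.
calls = [("read_file", "a.py"), ("read_file", "a.py"), ("read_file", "a.py")]
looping = doom_loop(calls)
```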

Examples

All examples run without an API key via MNESIS_MOCK_LLM=1:

```bash
MNESIS_MOCK_LLM=1 uv run python examples/01_basic_session.py
MNESIS_MOCK_LLM=1 uv run python examples/05_parallel_processing.py
```

| File | Demonstrates |
|---|---|
| `examples/01_basic_session.py` | create(), send(), context manager, compaction monitoring |
| `examples/02_long_running_agent.py` | EventBus subscriptions, streaming callbacks, manual compact() |
| `examples/03_tool_use.py` | Tool lifecycle, ToolPart streaming states, tombstone inspection |
| `examples/04_large_files.py` | LargeFileHandler, FileRefPart, cache hits, exploration summaries |
| `examples/05_parallel_processing.py` | record() counterpart — LLMMap with Pydantic schema, AgenticMap sub-sessions |
| `examples/06_byo_llm.py` | record() — BYO-LLM, inject turns from your own SDK |

Documentation

Full documentation is available at mnesis.lucenor.tech.


Architecture

```
MnesisSession
├── ImmutableStore         (SQLite append-only log)
├── SummaryDAGStore        (logical DAG over is_summary messages)
├── ContextBuilder         (assembles LLM message list each turn)
├── CompactionEngine       (three-level escalation + atomic commit)
│   ├── ToolOutputPruner   (backward-scanning tombstone pruner)
│   └── levels.py          (level 1 / 2 / 3 functions)
├── LargeFileHandler       (content-addressed file references)
├── TokenEstimator         (tiktoken + heuristic fallback)
└── EventBus               (in-process pub/sub)

Operators (independent of session)
├── LLMMap                 (parallel stateless LLM calls)
└── AgenticMap             (parallel sub-agent sessions)
```
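The TokenEstimator's "tiktoken + heuristic fallback" pattern is worth spelling out: use an exact tokenizer when the library is installed, and fall back to a cheap characters-per-token estimate otherwise. A sketch of that pattern (the divisor of 4 is an assumption — a common rule of thumb, not mnesis's documented constant):

```python
def estimate_tokens(text: str) -> int:
    try:
        import tiktoken                 # exact count when the library is available
        enc = tiktoken.get_encoding("cl100k_base")
        return len(enc.encode(text))
    except ImportError:
        return max(1, len(text) // 4)   # heuristic: roughly 4 characters per token

n = estimate_tokens("Context rot degrades long-horizon reasoning.")
```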

Contributing

Contributions are welcome — bug reports, feature requests, and pull requests alike.

See CONTRIBUTING.md for the full guide. The short version:

```bash
# Clone and install with dev dependencies
git clone https://github.com/Lucenor/mnesis.git
cd mnesis
uv sync --group dev

# Lint + format
uv run ruff check src/ tests/
uv run ruff format src/ tests/

# Type check
uv run mypy src/mnesis

# Test (with coverage)
uv run pytest

# Run examples without API keys
MNESIS_MOCK_LLM=1 uv run python examples/01_basic_session.py
MNESIS_MOCK_LLM=1 uv run python examples/05_parallel_processing.py

# Build docs locally
uv sync --group docs
uv run mkdocs serve
```

Open an issue before starting large changes — it avoids duplicated effort.


License

Apache 2.0 — see LICENSE and NOTICE.
