Short-term memory proxy gateway with proactive memory surfacing for AI agents

Maintainers

memtomem

memtomem-stm

Official website & docs: https://memtomem.com

🚧 Alpha — APIs and defaults may change between 0.1.x releases. Feedback and issue reports are especially welcome: Issues · Discussions.

Spend fewer tokens. Remember more. Ship faster.

memtomem-stm is an MCP proxy that typically cuts token usage by 20–80% and gives your agent memory across sessions — with no changes to your upstream MCP servers.

It sits between your AI agent and its upstream MCP servers, compressing tool responses, caching repeated calls, and automatically surfacing relevant context from prior sessions via a memtomem LTM server.

What memtomem-stm does:

Cuts token spend on repeated reads — compresses and caches tool responses, so the agent doesn't re-pay for the same file or search result. Works with Claude Code, Cursor, Claude Desktop, or any MCP client.
Carries context across sessions — surfaces prior decisions from memtomem LTM automatically, so the agent picks up where it left off rather than re-discovering what it already knew.
Drops in front of any MCP server — adds compression, caching, and observability as a proxy layer, without changes to upstream code.

flowchart TB
    Agent["Agent<br/>(Claude Code, Cursor, …)"]
    subgraph STM["memtomem-stm (STM)"]
        Pipe["CLEAN → COMPRESS → SURFACE → (INDEX)"]
    end
    LTM[("memtomem LTM<br/>(MCP server)")]
    FS["filesystem<br/>MCP server"]
    GH["github<br/>MCP server"]
    Other["…any MCP server"]

    Agent -->|MCP| STM
    STM <-->|MCP: stdio / SSE / HTTP| FS
    STM <-->|MCP| GH
    STM <-->|MCP| Other
    STM <-.->|surfacing<br/>via MCP| LTM

The INDEX stage requires a FileIndexer engine. The standalone mms server does not wire one today, so the auto_index and extraction settings are inert in the default deployment: enabling them logs a warning at startup but does not write back to LTM. See #288 for the tracking issue on a future MCP-protocol-only adapter.
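The stage order in the diagram can be sketched in a few lines of Python. This is only an illustration of the ordering and of INDEX being skipped when no indexer is wired. The function names (clean, compress, surface, run_pipeline), the truncation-based "compression", and the memory format are all assumptions made for this sketch, not STM's actual implementation.

```python
from typing import Callable, List, Optional


def clean(text: str) -> str:
    """CLEAN (hypothetical): strip trailing spaces and collapse blank-line runs."""
    out: List[str] = []
    for line in (raw.rstrip() for raw in text.splitlines()):
        if line == "" and out and out[-1] == "":
            continue  # skip consecutive blank lines
        out.append(line)
    return "\n".join(out).strip()


def compress(text: str, max_chars: int = 200) -> str:
    """COMPRESS (hypothetical): naive truncation standing in for a real strategy."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + "\n[truncated]"


def surface(text: str, memories: List[str]) -> str:
    """SURFACE (hypothetical): append relevant memories when there are any."""
    if not memories:
        return text
    notes = "\n".join(f"- {m}" for m in memories)
    return f"{text}\n\nRelevant memories:\n{notes}"


def run_pipeline(
    response: str,
    memories: List[str],
    indexer: Optional[Callable[[str], None]] = None,
) -> str:
    """Run CLEAN -> COMPRESS -> SURFACE, then INDEX only if an indexer exists."""
    result = surface(compress(clean(response)), memories)
    if indexer is not None:  # with no indexer wired, INDEX is a no-op
        indexer(result)
    return result
```

Calling run_pipeline(raw_text, memories) without an indexer mirrors the default deployment described above: the first three stages run and nothing is written back.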

Installation

pip install memtomem-stm

Or with uv:

uv tool install memtomem-stm     # install mms / memtomem-stm as global CLI tools
uvx memtomem-stm --help          # or run without installing
uv pip install memtomem-stm      # or install into the active environment

memtomem-stm is independent: it has no Python-level dependency on memtomem core. To enable proactive memory surfacing, point STM at a running memtomem MCP server (or any compatible MCP server) — communication happens entirely through the MCP protocol.

Quick Start

mms is the short alias for memtomem-stm-proxy (memtomem-stm also works). All three commands are identical, so use whichever you prefer.

1. Add an upstream MCP server

For first-time setup, run the guided wizard — it prompts for name/prefix/command, optionally probes the server, and then offers to register STM with Claude Code (or generate .mcp.json) in the same flow:

mms init

Or add servers non-interactively:

mms add filesystem \
  --command npx \
  --args "-y @modelcontextprotocol/server-filesystem /home/user/projects" \
  --prefix fs

--prefix is required: it's the namespace under which the upstream server's tools will appear (e.g. fs__read_file). Repeat for each MCP server you want to proxy.

If you've already configured MCP servers in Claude Desktop, Claude Code, or a project .mcp.json, mms add --import (alias --from-clients) reuses the init wizard to bulk-select them — skipping anything already registered.

mms list      # show what you've added
mms status    # show full config + connectivity

2. Connect your AI client to STM

mms init ends with a 3-way prompt — pick option 1 and it shells out to claude mcp add for you. If you skipped that step or want to register with a different client later, run:

mms register

To register manually, use claude directly:

claude mcp add mms -s user -- mms

Or add it to a JSON MCP config for Cursor / Windsurf / Claude Desktop / Gemini:

{
  "mcpServers": {
    "mms": {
      "command": "mms"
    }
  }
}
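If you would rather script this edit than paste the JSON by hand, a minimal Python sketch follows. It merges the mms entry into an existing config without touching other servers. The helper name register_mms is hypothetical, and config_path is a placeholder: the location of the config file differs per client and is not specified here.

```python
import json
from pathlib import Path


def register_mms(config_path: Path, server_name: str = "mms", command: str = "mms") -> dict:
    """Add (or overwrite) one entry under "mcpServers", keeping all others.

    Hypothetical helper for illustration; it is not part of memtomem-stm.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    servers = config.setdefault("mcpServers", {})
    servers[server_name] = {"command": command}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Back up the original file before running something like this against a real client config.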

Why mms and not memtomem-stm? Either name works (the three entry points are interchangeable), but the MCP client composes proxied tool names as mcp__<server>__<prefix>__<tool>. The short alias mms (3 chars) is 9 characters shorter than memtomem-stm (12 chars), and that headroom helps keep upstreams with long tool names under the 64-char MCP limit. If you registered under a different name and want the mms add overflow check (#261) to match exactly, export MMS_CLIENT_SERVER_NAME=<name> in your shell. Otherwise the check assumes a conservative default, which at worst causes a few false-positive warnings on borderline prefixes.
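The arithmetic behind that headroom can be checked with a short Python sketch. The naming pattern and the 64-char figure come from the paragraph above. The helper names are hypothetical, and treating the limit as inclusive (a 64-character name still fits) is an assumption. This is not the actual mms add overflow check.

```python
MCP_TOOL_NAME_LIMIT = 64  # figure quoted in the text; assumed inclusive here


def composed_name(server: str, prefix: str, tool: str) -> str:
    """Build the client-side name: mcp__<server>__<prefix>__<tool>."""
    return f"mcp__{server}__{prefix}__{tool}"


def headroom(server: str, prefix: str, tool: str) -> int:
    """Characters left before the limit; negative means the name overflows."""
    return MCP_TOOL_NAME_LIMIT - len(composed_name(server, prefix, tool))


def fits(server: str, prefix: str, tool: str) -> bool:
    return headroom(server, prefix, tool) >= 0


# "mcp__mms__fs__read_file" is 23 characters long.
print(composed_name("mms", "fs", "read_file"), headroom("mms", "fs", "read_file"))

# A 45-character tool name fits under mms but not under memtomem-stm.
long_tool = "x" * 45
print(fits("mms", "gh", long_tool), fits("memtomem-stm", "gh", long_tool))
```

The difference in headroom between the two server names is always 9, regardless of prefix or tool.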

3. Use the proxied tools

Your agent now sees proxied tools (fs__read_file, gh__search_repositories, etc.). The CLEAN / COMPRESS / SURFACE stages run automatically — responses are cleaned, compressed, cached, and (when an LTM server is configured) enriched with relevant memories. The INDEX stage (auto_index / extraction) is currently inactive in the standalone server; see #288.

To check what's happening, ask the agent to call stm_proxy_stats.

What STM proxies — and what it doesn't

STM is an MCP proxy: it sees a tool call only if the client routes that call through the MCP protocol. Coverage depends on how your client invokes the tool, not on what the tool does.

STM sees: any MCP server you register with mms add — every tool under the mcp__<server>__<prefix>__<tool> namespace — plus LTM surfacing calls to a configured memtomem server.

STM does NOT see:

Claude Code's built-in tools — Read, Write, Edit, Bash, Grep, Glob, WebFetch. They run inside the client and never reach an MCP server, so their token spend is invisible to STM and unaffected by compression or caching.
Cursor / Windsurf / Claude Desktop built-ins — same principle: anything the client provides natively bypasses the MCP layer.
Sub-agent built-in calls — the parent's MCP wiring is inherited, but built-in tool calls inside an Agent / Task invocation stay client-internal.

STM does NOT write back to LTM at runtime today. The standalone mms server constructs the proxy without a FileIndexer engine, so the INDEX stage (auto_index, extraction) is inert even when enabled in stm_proxy.json — a warning is logged at startup. Surfacing reads from LTM via MCP; runtime writes are tracked in #288 and require an MCP-protocol-only adapter that doesn't exist yet.

To bring file or shell operations under STM, register an MCP server that exposes them (the filesystem example above is the most common case) and steer the agent toward the proxied alias instead of the built-in. This is the same boundary every MCP proxy lives within — it's not specific to STM.
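One rough way to see where this boundary falls in practice is to classify tool names from a session by their prefix. The sketch below assumes you have a list of tool names your agent called (how you collect it depends on your client and is not covered here) and that STM is registered under the server name mms. Both helper names are hypothetical.

```python
from typing import List


def routed_through_stm(tool_name: str, server_name: str = "mms") -> bool:
    """True if the name follows mcp__<server>__... for the given server name."""
    return tool_name.startswith(f"mcp__{server_name}__")


def stm_coverage(tool_calls: List[str], server_name: str = "mms") -> float:
    """Fraction of calls whose name indicates they went through STM."""
    if not tool_calls:
        return 0.0
    seen = sum(routed_through_stm(name, server_name) for name in tool_calls)
    return seen / len(tool_calls)


calls = ["Read", "Bash", "mcp__mms__fs__read_file", "mcp__mms__gh__search_repositories"]
print(stm_coverage(calls))
```

A low fraction suggests the agent is still favoring built-ins over the proxied aliases.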

Project-scoped MCPs (mms project + mms import)

A second tier of management lets you decide which MCP servers a given project sees, separately from the STM proxy gateway config. State lives in a new dotdir, ~/.mms/:

mms import --from claude-code — pull existing MCP definitions out of ~/.claude.json, ~/.cursor/mcp.json, ~/.codex/config.toml, or Claude Desktop's config into ~/.mms/registry.toml (secrets redacted in --plan, written verbatim under --apply).
mms project init — create a <project>/.mms/project.toml marker (commit-recommended).
mms project enable filesystem github — declare which MCPs that project wants visible.
mms project list / mms project show — inspect the index and the current project.

~/.mms/ is intentionally separate from ~/.memtomem/ — STM proxy bootstrap (stm_proxy.json) and mms project state (registry.toml) are fully disjoint in W1: mms add writes only stm_proxy.json, mms import --apply writes only registry.toml. See docs/cli.md for the full reference.

Tutorial notebooks

Try it without wiring into your AI client first. A quickstart Jupyter notebook registers an upstream MCP server, calls a proxied tool, and reads stm_proxy_stats end-to-end. Clone the repo, run uv sync, then run uv run jupyter lab notebooks/. No external services are needed.

Key Features

🗜️ Typically 20–80% fewer tokens per tool call — 10 compression strategies with auto-selection by content type, query-aware budget, and zero-loss progressive delivery → docs/compression.md
🧠 Your agent remembers — proactive memory surfacing from prior sessions, gated by relevance threshold, rate limit, dedup, and circuit breaker → docs/surfacing.md
💾 Repeated calls are free — response cache with TTL and eviction; surfacing re-applied on cache hit so injected memories stay fresh → docs/caching.md
🛡️ Production-safe — circuit breaker, retry with backoff, write-tool skip, query cooldown, dedup, sensitive content auto-detection, Langfuse tracing, horizontal scaling via PendingStore
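To make "response cache with TTL and eviction" concrete, here is a minimal Python sketch of the general technique: entries expire after a fixed time, and the least recently used entry is evicted when the cache is full. The class name, the LRU eviction policy, and the key format are assumptions for illustration. STM's real cache is documented in docs/caching.md and may differ.

```python
import json
import time
from collections import OrderedDict
from typing import Callable, Optional


def cache_key(tool: str, arguments: dict) -> str:
    """Hypothetical key: tool name plus canonically ordered JSON arguments."""
    return f"{tool}:{json.dumps(arguments, sort_keys=True)}"


class TTLCache:
    """Tiny TTL cache with LRU eviction (illustrative, not STM's code)."""

    def __init__(self, ttl_seconds: float, max_entries: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: drop and report a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: str) -> None:
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max:
            self._items.popitem(last=False)  # evict least recently used
```

The clock is injectable so that expiry can be tested without sleeping.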

Documentation

Guide	Topic
Surfacing	How agents recall prior context automatically
Compression	All 10 strategies — pick the right one for your content
Caching	Skip repeated work with response caching
Configuration	Tune settings without touching code
CLI	CLI commands and the 11 MCP tools

Development

uv sync                                                    # install dev deps
uv run pytest -m "not ollama and not bench_qa_meta and not bench_qa_llm_judge and not bench_qa_sweep"   # tests (CI filter)
uv run ruff check src && uv run ruff format --check src    # lint (required)
uv run mypy src                                            # typecheck (advisory)

CI runs the same commands on every PR via .github/workflows/ci.yml. Lint (ruff check + ruff format --check) and tests must pass; mypy is advisory.

License

Apache License 2.0. Contributions are accepted under the terms of the Contributor License Agreement.

Release history

Version	Release date
0.1.25	Jun 2, 2026 (this version)
0.1.24	May 31, 2026
0.1.23	May 6, 2026
0.1.22	May 2, 2026
0.1.21	Apr 29, 2026
0.1.20	Apr 27, 2026
0.1.19	Apr 26, 2026
0.1.18	Apr 25, 2026
0.1.17	Apr 24, 2026
0.1.16	Apr 22, 2026
0.1.15	Apr 21, 2026
0.1.14	Apr 21, 2026
0.1.13	Apr 21, 2026
0.1.12	Apr 20, 2026
0.1.11	Apr 19, 2026
0.1.10	Apr 19, 2026
0.1.9	Apr 19, 2026
0.1.8	Apr 19, 2026
0.1.7	Apr 13, 2026
0.1.6	Apr 13, 2026
0.1.5	Apr 12, 2026
0.1.4	Apr 12, 2026
0.1.3	Apr 11, 2026
0.1.2	Apr 10, 2026
0.1.1	Apr 10, 2026
0.1.0	Apr 10, 2026

Download files

Source Distribution

memtomem_stm-0.1.24.tar.gz (959.3 kB)

Uploaded May 31, 2026 Source

Built Distribution

memtomem_stm-0.1.24-py3-none-any.whl (300.1 kB)

Uploaded May 31, 2026 Python 3

File details

Details for the file memtomem_stm-0.1.24.tar.gz.

File metadata

Download URL: memtomem_stm-0.1.24.tar.gz
Upload date: May 31, 2026
Size: 959.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for memtomem_stm-0.1.24.tar.gz
Algorithm	Hash digest
SHA256	`f42ecd7c5606893c0d9ec40d806a0847a03fdc08927e2b2a483fdd8494d03345`
MD5	`3ac7c0f8f069d44ef08915c66404a13e`
BLAKE2b-256	`b8b6db4a686ad3cdf797a2d9008fad880817a4dd07460e3d341fc663eccfb57e`
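
To check a downloaded file against the SHA256 digest listed above, a short Python sketch using only the standard library is enough. The helper names are hypothetical, and the file path in the usage note assumes the archive sits in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value, ignoring case and spaces."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_sha256(Path("memtomem_stm-0.1.24.tar.gz"), "f42ecd7c5606893c0d9ec40d806a0847a03fdc08927e2b2a483fdd8494d03345") should return True for an intact copy of the source distribution.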

Provenance

The following attestation bundles were made for memtomem_stm-0.1.24.tar.gz:

Publisher: release.yml on memtomem/memtomem-stm

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: memtomem_stm-0.1.24.tar.gz
- Subject digest: f42ecd7c5606893c0d9ec40d806a0847a03fdc08927e2b2a483fdd8494d03345
- Sigstore transparency entry: 1686815329
- Sigstore integration time: May 31, 2026
Source repository:
- Permalink: memtomem/memtomem-stm@32db71dfd159a324430bba887f6086446c94f04f
- Branch / Tag: refs/tags/v0.1.24
- Owner: https://github.com/memtomem
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@32db71dfd159a324430bba887f6086446c94f04f
- Trigger Event: push

File details

Details for the file memtomem_stm-0.1.24-py3-none-any.whl.

File metadata

Download URL: memtomem_stm-0.1.24-py3-none-any.whl
Upload date: May 31, 2026
Size: 300.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for memtomem_stm-0.1.24-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3e54758ec59d116e857914f5f48529e6064f8899d908adcc9dfd6e8fe37fbefc`
MD5	`2a90193cace6f3394cce9b09a8469f88`
BLAKE2b-256	`161ea9ca4397ec350370eb1250dfb5c3a664dff1e1fdab40957a5184e154f118`

Provenance

The following attestation bundles were made for memtomem_stm-0.1.24-py3-none-any.whl:

Publisher: release.yml on memtomem/memtomem-stm

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: memtomem_stm-0.1.24-py3-none-any.whl
- Subject digest: 3e54758ec59d116e857914f5f48529e6064f8899d908adcc9dfd6e8fe37fbefc
- Sigstore transparency entry: 1686815411
- Sigstore integration time: May 31, 2026
Source repository:
- Permalink: memtomem/memtomem-stm@32db71dfd159a324430bba887f6086446c94f04f
- Branch / Tag: refs/tags/v0.1.24
- Owner: https://github.com/memtomem
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@32db71dfd159a324430bba887f6086446c94f04f
- Trigger Event: push
