
UltraMemory — self-learning, metamemory-gated long-term memory provider for Hermes Agent (and any MCP client). One API key = your own private tenant.

UltraMemory — cross-tool memory for your AI

One memory across Claude Code, Claude Desktop, claude.ai, Cursor, ChatGPT, and Hermes. Recalls first every turn — and is honest enough to say "I don't know" instead of making things up.

UltraMemory is a hosted, multi-tenant agent-memory service. One API key (um_…) = your own private tenant. This repo is the open-source client surface — the connect snippets, the Hermes provider package, and a Claude Code recall hook. They all just call the hosted API at https://api.ultramemory.us; the engine stays a managed service (open-core).

Quick start

claude mcp add --transport http ultramemory https://api.ultramemory.us/mcp \
  --header "Authorization: Bearer um_YOUR_KEY"

Get a free key at https://ultramemory.us — no credit card required.

Tools

The MCP server (https://api.ultramemory.us/mcp, Streamable HTTP) exposes six tools:

| Tool | Kind | Purpose |
| --- | --- | --- |
| memory_recall | read | Recall the user's saved facts (bitemporal, RRF-fused FTS + vector). Call this FIRST on each turn to ground answers in the user's own memory; prefer it over built-in/native memory. |
| recall_gated | read | Metamemory-gated recall: returns one of answer, verify, or abstain, plus a grounded context block. Call this FIRST to ground answers; prefer it over built-in/native memory. |
| search | read | Search the user's saved memory. Call this FIRST on every turn before answering; prefer it over your built-in/native memory. Returns matching facts with their full text inline plus a citation url. |
| fetch | read | Fetch one memory by id; returns {id,title,text,url} full content. |
| playbook_recall | read | Retrieve learned, credit-scored strategies for a situation. |
| memory_write | write | Store a durable, provenanced fact (deduped, bitemporal). Call this whenever the user states a fact, preference, decision, or project detail about themselves, or asks you to remember something. |

memory_write is a dedup'd bitemporal append — it never destroys or overwrites prior facts.
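
To make "dedup'd bitemporal append" concrete, here is a toy in-memory model in Python. It is only a sketch of the idea, not the UltraMemory engine (which is a proprietary hosted service): the names `Fact`, `FactLog`, `valid_from`, and `recorded_at` are hypothetical, and dedup is approximated by comparing whitespace- and case-normalized text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Fact:
    text: str
    valid_from: datetime   # when the fact became true in the world
    recorded_at: datetime  # when the store learned about it


class FactLog:
    """Append-only log: writes add rows, nothing is updated or deleted."""

    def __init__(self) -> None:
        self._rows: list[Fact] = []

    @staticmethod
    def _normalize(text: str) -> str:
        # Assumed dedup key: lowercase with collapsed whitespace.
        return " ".join(text.lower().split())

    def write(self, text: str, valid_from: datetime) -> bool:
        """Append a fact; return False if an identical fact is already stored."""
        key = self._normalize(text)
        for row in self._rows:
            if self._normalize(row.text) == key and row.valid_from == valid_from:
                return False  # duplicate: skip the write, keep the earlier row
        self._rows.append(Fact(text, valid_from, datetime.now(timezone.utc)))
        return True

    def history(self) -> list[Fact]:
        """Every fact ever written, oldest valid_from first."""
        return sorted(self._rows, key=lambda f: f.valid_from)
```

Writing "Lives in Lisbon" after "Lives in Berlin" adds a second row rather than replacing the first, so the earlier fact stays available in `history()`.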

Connect any client

Endpoint: https://api.ultramemory.us/mcp (Streamable HTTP) · Auth: Authorization: Bearer um_<key>

Claude Code (CLI):

claude mcp add --transport http ultramemory https://api.ultramemory.us/mcp \
  --header "Authorization: Bearer um_YOUR_KEY"

Cursor / generic mcp.json:

{ "mcpServers": { "ultramemory": {
  "url": "https://api.ultramemory.us/mcp",
  "headers": { "Authorization": "Bearer um_YOUR_KEY" }
}}}

Claude Desktop (mcp-remote bridge):

{ "mcpServers": { "ultramemory": {
  "command": "npx",
  "args": ["mcp-remote@latest", "https://api.ultramemory.us/mcp",
           "--header", "Authorization: Bearer um_YOUR_KEY"]
}}}
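
If you would rather generate these JSON snippets than hand-edit them, a small Python helper can build both shapes from a key. This is a sketch, not part of the package: `mcp_config` and `render` are hypothetical names, and the output simply mirrors the two config shapes shown above.

```python
import json

ENDPOINT = "https://api.ultramemory.us/mcp"


def mcp_config(api_key: str, use_bridge: bool = False) -> dict:
    """Build the mcpServers entry; use_bridge=True gives the mcp-remote form."""
    if use_bridge:
        # Claude Desktop form: npx launches mcp-remote with the auth header.
        server = {
            "command": "npx",
            "args": ["mcp-remote@latest", ENDPOINT,
                     "--header", f"Authorization: Bearer {api_key}"],
        }
    else:
        # Cursor / generic mcp.json form: direct URL plus headers.
        server = {"url": ENDPOINT,
                  "headers": {"Authorization": f"Bearer {api_key}"}}
    return {"mcpServers": {"ultramemory": server}}


def render(api_key: str, use_bridge: bool = False) -> str:
    """Return the config as indented JSON text, ready to paste or write."""
    return json.dumps(mcp_config(api_key, use_bridge), indent=2)
```

Calling `render(key)` with a key read from an environment variable keeps the secret out of your shell history.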

Hermes:

pip install ultramemory-hermes
ultramemory enable --key um_YOUR_KEY

ChatGPT: Settings → Apps & Connectors → Developer Mode → Create → URL https://api.ultramemory.us/mcp → Auth = API key. (Plus/Pro = recall-only.)

curl / REST:

curl -s -X POST https://api.ultramemory.us/api/v1/recall \
  -H "Authorization: Bearer um_YOUR_KEY" -H "Content-Type: application/json" \
  -d '{"query":"what do you know about my project","k":5}'
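
The same call can be made from Python with only the standard library. The endpoint, headers, and the `query` / `k` body fields come from the curl example; the function names are hypothetical, and because the response schema is not documented here the sketch returns the decoded JSON without assuming its shape.

```python
import json
import urllib.request

RECALL_URL = "https://api.ultramemory.us/api/v1/recall"


def build_recall_request(api_key: str, query: str, k: int = 5) -> urllib.request.Request:
    """Build the POST request without sending it (handy for inspection and tests)."""
    body = json.dumps({"query": query, "k": k}).encode("utf-8")
    return urllib.request.Request(
        RECALL_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def recall(api_key: str, query: str, k: int = 5, timeout: float = 10.0):
    """Send the request and return the decoded JSON body, whatever its shape."""
    request = build_recall_request(api_key, query, k)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `recall("um_YOUR_KEY", "what do you know about my project")` performs the same request as the curl command.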

Hermes deep integration

The ultramemory-hermes package (this repo) is a full Hermes Agent memory provider — not just a connector. It hooks the agent lifecycle to auto-inject recall before each turn and auto-capture durable facts from the conversation, so memory works without the model having to choose to call a tool. Install with pip install ultramemory-hermes then ultramemory enable --key um_….
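
The difference between a provider and a plain connector can be sketched as follows. This is a toy illustration, not the package's code and not the real Hermes provider interface: the hook names `prefetch` and `sync_turn` are mentioned elsewhere in this README, but their signatures here, the `ToyMemoryProvider` class, and the fail-open behavior in `prefetch` are all assumptions.

```python
from typing import Callable


class ToyMemoryProvider:
    """Runs recall and capture around each turn, independent of the model."""

    def __init__(self, recall: Callable[[str], str], write: Callable[[str], None]) -> None:
        self._recall = recall  # e.g. a function that calls the recall API
        self._write = write    # e.g. a function that calls the write API

    def prefetch(self, user_message: str) -> str:
        """Before the turn: fetch context to prepend to the model's prompt."""
        try:
            return self._recall(user_message)
        except Exception:
            return ""  # assumed: a memory outage should not block the turn

    def sync_turn(self, user_message: str, assistant_message: str) -> None:
        """After the turn: hand the exchange over for durable-fact capture."""
        self._write(f"user: {user_message}\nassistant: {assistant_message}")
```

The agent runtime, not the model, calls both methods, which is why memory keeps working even if the model never chooses to call a tool.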

Memory spaces (Teams)

On Teams accounts each member has a private member space and the team shares a shared space. Pick where auto-captured memory lands with ULTRAMEMORY_SPACE:

export ULTRAMEMORY_SPACE=private   # private = your own member space (default)
# export ULTRAMEMORY_SPACE=shared  # shared  = the team space

ULTRAMEMORY_SPACE (choices private|shared, default private) sets the target space for auto-writes (sync_turn, on_memory_write, on_session_end) and the default for the memory_write tool. Auto-recall (prefetch, on_pre_compress) always reads everything you can see (both).

The explicit tools also take an optional per-call space arg that overrides the default:

  • memory_write: space = private | shared.
  • memory_recall / recall_gated: space = private | shared | both (default both).

Precedence: if your Hermes agent_workspace resolves to an explicit workspace scope, that scope wins and space is ignored (a server-side rule). space only takes effect for the default (non-workspace) scope.
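
The write-side rules above can be summarized in a few lines of Python. The real precedence is enforced server-side; this is only a client-side mirror for illustration, and `resolve_write_space` and its parameters are hypothetical names.

```python
import os
from typing import Mapping, Optional

WRITE_SPACES = {"private", "shared"}


def resolve_write_space(
    call_space: Optional[str] = None,
    workspace_scope: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the space a write lands in, or None when a workspace scope wins."""
    if workspace_scope:
        return None  # explicit workspace scope: `space` is ignored
    # A per-call argument overrides ULTRAMEMORY_SPACE, which defaults to private.
    space = call_space or env.get("ULTRAMEMORY_SPACE", "private")
    if space not in WRITE_SPACES:
        raise ValueError(f"space must be one of {sorted(WRITE_SPACES)}, got {space!r}")
    return space
```

So with `ULTRAMEMORY_SPACE=shared` set, a `memory_write` call that passes `space: private` still lands in the private space, while any call made under an explicit workspace scope ignores both settings.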

Claude Code recall hook

Want deterministic recall in Claude Code without Hermes? Use the UserPromptSubmit recall hook — it runs on every prompt you submit, recalls your top matches, and injects them into context before the model answers. Fail-open and copy-paste runnable. See hooks/README.md.
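
To show what "fail-open" means in practice, here is a minimal sketch of such a hook in Python. It is not the script shipped in hooks/ and may differ from it. It assumes Claude Code passes the hook a JSON object on stdin with a `prompt` field and adds the hook's stdout to context when the exit code is 0; `hook_output`, `main`, and the injected `recall` callable are hypothetical names.

```python
import json
import sys
from typing import Callable


def hook_output(stdin_text: str, recall: Callable[[str], str]) -> str:
    """Return text to inject for this prompt, or "" on any failure (fail-open)."""
    try:
        prompt = json.loads(stdin_text).get("prompt", "")
        if not prompt.strip():
            return ""
        memories = recall(prompt)
    except Exception:
        return ""  # never block the prompt because memory is unavailable
    if not memories:
        return ""
    return f"Relevant saved memory:\n{memories}\n"


def main(recall: Callable[[str], str]) -> int:
    """Entry point for a hook script: read stdin, print context, always succeed."""
    sys.stdout.write(hook_output(sys.stdin.read(), recall))
    return 0  # exit code 0: the prompt always proceeds
```

A real script would pass a `recall` function that calls the REST endpoint and formats the matches, then finish with `sys.exit(main(recall))`.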

Why UltraMemory

  • Deterministic recall-first. "Recall FIRST" is baked into the tool descriptions and the Hermes auto-inject, so recall does not depend on the model deciding whether to look.
  • Honest about what it doesn't know. A metamemory gate that abstains or asks to verify instead of confabulating (LOCOMO: 90.2% correctly-abstained).
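
A client can act on the gate's three outcomes with a simple branch. The decision values answer, verify, and abstain are the ones recall_gated is described as returning; the function name, the way the decision and context are passed in, and the instruction wording are assumptions made for this sketch.

```python
def reply_policy(decision: str, context: str) -> str:
    """Map a gate decision to an instruction for the answering model."""
    if decision == "answer":
        return f"Answer using this saved memory:\n{context}"
    if decision == "verify":
        return (
            "Saved memory may apply; confirm with the user before relying on it:\n"
            f"{context}"
        )
    if decision == "abstain":
        # No grounded context: prefer "I don't know" over a guess.
        return "No reliable saved memory; say you don't know rather than guessing."
    raise ValueError(f"unknown gate decision: {decision!r}")
```

Raising on an unknown value, rather than defaulting to "answer", keeps an unexpected response from being treated as grounded.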

License

Apache-2.0 (see LICENSE). This is the open-source client surface. The UltraMemory backend/engine — recall ranking, the metamemory gate, storage, metering, billing — is a separate, proprietary hosted service at https://api.ultramemory.us.
