evermemos-mcp
Universal long-term memory layer for AI coding assistants, powered by EverMemOS.
Built for the Memory Genesis Competition 2026 — Track 2: Platform Plugins
evermemos-mcp is an MCP (Model Context Protocol) server that gives any compatible AI client — Claude Code, Cursor, Cline, Cherry Studio, and more — persistent, cross-session memory. It bridges the gap between stateless AI conversations and the contextual awareness that real-world workflows demand.
Why This Exists
AI coding assistants forget everything between sessions. You explain your architecture, your preferences, your project context — and next session, it's all gone. evermemos-mcp solves this by providing a Memory → Reasoning → Action loop:
- Remember — Store decisions, preferences, and context as you work
- Recall — Retrieve relevant memories using hybrid search (keyword + vector + semantic)
- Brief — Get a full context restoration at the start of any new session
All memories are organized into isolated spaces (e.g. coding:my-app, study:ml-notes, chat:daily), so different projects and workflows never bleed into each other.
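The `<domain>:<slug>` naming convention can be enforced with a small validator. This is an illustrative sketch only; the pattern and function name are assumptions, not the server's actual implementation:

```python
import re

# Hypothetical validator for the <domain>:<slug> space naming convention.
SPACE_ID_PATTERN = re.compile(r"^(?P<domain>[a-z][a-z0-9-]*):(?P<slug>[a-z0-9][a-z0-9._-]*)$")

def parse_space_id(space_id: str) -> tuple[str, str]:
    """Split a space_id like 'coding:my-app' into (domain, slug)."""
    match = SPACE_ID_PATTERN.match(space_id)
    if match is None:
        raise ValueError(f"invalid space_id: {space_id!r}")
    return match.group("domain"), match.group("slug")

print(parse_space_id("coding:my-app"))  # ('coding', 'my-app')
```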
Demo
Final demo video will be added after the last recording pass.
Current submission-ready evidence:
- Primary benchmark summary: `artifacts/competition/2026-02-26-formal-real-auto-all-v3/benchmark_summary.json`
- Human-readable benchmark report: `artifacts/competition/2026-02-26-formal-real-auto-all-v3/benchmark_report.md`
- Evidence release (`runs.jsonl`): `competition-evidence-2026-02-26`
- Latest lifecycle appendix: `artifacts/competition/2026-03-07-lifecycle-appendix-dec0612e/appendix_notes.md` (remember/searchable/isolation pass; forget remains a current Cloud limitation)
Features
| Tool | Description |
|---|---|
| `list_spaces` | Discover available memory spaces |
| `remember` | Store information into long-term memory (async extraction) |
| `request_status` | Check whether a queued write is still queued or has been reported completed by upstream |
| `recall` | Search memories with 6 retrieval strategies; labels whether results are searchable, provisional, or fallback |
| `briefing` | Get a structured context briefing (profile + episodes + facts + foresights), with explicit fallback labeling when needed |
| `forget` | Attempt targeted memory deletion by ID (Cloud behavior may vary) |
| `fetch_history` | Paginate through the memory timeline by type |
Key Capabilities
- Space isolation — `space_id` (`<domain>:<slug>`) keeps memories separated by project or topic
- Multi-space search — Query up to 10 spaces in a single `recall` call with automatic source attribution
- Traceable citations — Every result includes `memory_type`, `snippet`, `timestamp`, `score`, and an optional `source_message_id`
- Multi-user support — Optional `user_id` filtering for shared spaces
- Conversation metadata sync — Automatic `conversation-meta` integration with EverMemOS Cloud
- Async-friendly identity fallback — Chat identity and preferences are mirrored into `conversation-meta` on a best-effort basis and can be surfaced as explicit fallback results when extracted search results are unavailable
- Unified lifecycle semantics — `remember`, `request_status`, `recall`, and `briefing` expose compatible `lifecycle` blocks, so clients can consistently distinguish `queued`, `provisional`, `fallback`, and `searchable`
- Robust error handling — Retries with backoff (429 / 5xx), GET body fallback for proxy/WAF compatibility, and structured error codes
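The retry-with-backoff behavior can be sketched as follows. This is illustrative only; the actual HTTP client's retry policy, attempt counts, and delays may differ:

```python
import random
import time

# Statuses worth retrying: rate limiting and transient server errors.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}

def with_backoff(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send() must return an object with a .status_code attribute.
    """
    for attempt in range(max_attempts):
        response = send()
        if response.status_code not in RETRYABLE_STATUSES:
            return response
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter: 0.5s, 1s, 2s (plus noise).
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return response
```

Injecting `sleep` keeps the helper testable without real delays.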
Quick Start
Get your API key from EverMemOS Cloud.
Option A: Install from PyPI (recommended)
No clone needed — just add to your MCP client config:
```json
{
  "mcpServers": {
    "evermemos-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["evermemos-mcp@latest"],
      "env": {
        "EVERMEMOS_API_KEY": "your-key-here",
        "EVERMEMOS_USER_ID": "mcp-user"
      }
    }
  }
}
```
Or run directly from the command line:

```bash
uvx evermemos-mcp@latest
```

If your client still launches an older cached build after a release, refresh it with:

```bash
uv cache clean evermemos-mcp
```
Option B: Install from source
```bash
git clone https://github.com/tt-a1i/evermemos-mcp.git
cd evermemos-mcp
cp .env.example .env
# Edit .env and set your EVERMEMOS_API_KEY
uv run evermemos-mcp
```
MCP client config for source installs:
```json
{
  "mcpServers": {
    "evermemos-mcp": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/absolute/path/to/evermemos-mcp",
        "evermemos-mcp"
      ],
      "env": {
        "EVERMEMOS_API_KEY": "your-key-here",
        "EVERMEMOS_USER_ID": "mcp-user"
      }
    }
  }
}
```
Client-specific setup guides (Claude Code, Cursor, Cline, Cherry Studio) are in `docs/05-client-integrations.md`.
Architecture
```
MCP Client (Claude Code / Cursor / Cline / Cherry Studio)
              │
              │ MCP stdio
              ▼
┌─────────────────────────────┐
│     evermemos-mcp server    │
│  ┌───────────────────────┐  │
│  │    7 Tool Handlers    │  │
│  └──────────┬────────────┘  │
│  ┌──────────▼────────────┐  │
│  │    Memory Service     │  │  remember / request_status / recall / briefing / forget / fetch_history
│  └──────────┬────────────┘  │
│  ┌──────────▼────────────┐  │
│  │ Space Catalog Service │  │  space registry, metadata sync, cross-session recovery
│  └──────────┬────────────┘  │
│  ┌──────────▼────────────┐  │
│  │ EverMemOS HTTP Client │  │  auth, retries, rate-limit backoff, error normalization
│  └──────────┬────────────┘  │
└─────────────┼───────────────┘
              │ HTTPS
              ▼
      EverMemOS Cloud API
```
- Cloud-first — All memories live in EverMemOS Cloud. No local persistence, no state to lose.
- Process-local cache — The space catalog is cached in memory for fast lookups and recovered from Cloud on startup.
- Async extraction — `remember` queues content for AI-powered extraction; memories become searchable after processing.
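The process-local catalog cache can be sketched like this. The class and method names are assumptions for illustration, not the package's real internals:

```python
# Illustrative process-local cache for the space catalog.
# fetch_remote is a placeholder for a call to the Cloud API.
class SpaceCatalog:
    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # callable returning {space_id: metadata}
        self._cache: dict[str, dict] | None = None

    def spaces(self) -> dict[str, dict]:
        # Recover from Cloud on first use; serve from memory afterwards.
        if self._cache is None:
            self._cache = self._fetch_remote()
        return self._cache
```

The point of the design: losing the process loses only a cache, never the memories themselves.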
Memory Lifecycle States
| State | Meaning | Typical surface |
|---|---|---|
| `queued` | The write was accepted, but formal extraction is not yet confirmed searchable | `remember.lifecycle`, `request_status.lifecycle`, `recall.pending_count` |
| `provisional` | The answer comes from `pending_messages` while extraction is still queued | `recall.results[].stability == provisional` |
| `fallback` | The answer comes from mirrored `conversation-meta`, not formal extracted memory | `recall.results[].stability == fallback`, `briefing.highlights[].stability == fallback` |
| `searchable` | The answer comes from formal extracted memories returned by search/fetch APIs | `recall.results[].stability == searchable`, `briefing.highlights[].stability == searchable` |
If a tool can answer using provisional or fallback data, that does not mean formal extraction has completed.
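As a sketch of how a client might consume these labels (the field names follow the table above, but the result shape and grouping logic are illustrative assumptions, not a published schema):

```python
# Partition recall results by the stability label described above.
def partition_by_stability(results: list[dict]) -> dict[str, list[dict]]:
    buckets: dict[str, list[dict]] = {"searchable": [], "provisional": [], "fallback": []}
    for item in results:
        buckets.setdefault(item.get("stability", "searchable"), []).append(item)
    return buckets

results = [
    {"snippet": "Chose PostgreSQL", "stability": "searchable"},
    {"snippet": "Prefers dark mode", "stability": "fallback"},
]
groups = partition_by_stability(results)
# Only entries in groups["searchable"] reflect completed formal extraction.
```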
Space Templates
Use spaces to separate intent, not just data:
| Template | Use it for |
|---|---|
| `chat:preferences` | Durable personal preferences, names, tone, UI likes/dislikes |
| `chat:daily` | Ongoing chat context that should not leak into project memory |
| `coding:<repo>` | Architecture decisions, conventions, bugs, and project context |
| `study:<topic>` | Learning notes, topic progress, and revision context |
Why split spaces? Because "who I am", "what this repo needs", and "what I'm learning" should not overwrite each other.
Which Tool To Use
| Goal | Primary tool | Why |
|---|---|---|
| Start a new session | `briefing` | Fastest way to restore context in one call |
| Find the most relevant prior fact | `recall` | Relevance-ranked lookup across one or more spaces |
| Review what happened over time | `fetch_history` | A chronological timeline beats ranked search for audits and replay |
| Verify before or after deletion | `fetch_history` | Stable timeline check before trusting `forget` |
If `recall` feels unstable or too selective, switch to `fetch_history` instead of retrying the same search blindly.
Forget Safety
`forget` is currently a best-effort Cloud operation, not a guaranteed instant erase.
Recommended deletion flow:
1. Use `fetch_history` or `recall` to confirm the target `memory_id`.
2. Call `forget(memory_ids=[...], space_id=...)`.
3. Re-check with `fetch_history` first, then `recall` if needed.
4. If the target still appears, treat it as a current Cloud limitation rather than proof that routing was wrong.
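The flow above can be sketched as client-side logic. Here `call_tool` is a stand-in for whatever tool-invocation function your MCP client exposes, and the response shapes are assumptions for illustration:

```python
# Hypothetical verify-delete-verify helper around the best-effort forget tool.
# call_tool(name, args) is a placeholder for your MCP client's tool invocation.
def safe_forget(call_tool, space_id: str, memory_id: str) -> bool:
    history = call_tool("fetch_history", {"space_id": space_id})
    if not any(m["memory_id"] == memory_id for m in history["memories"]):
        return True  # nothing to delete
    call_tool("forget", {"space_id": space_id, "memory_ids": [memory_id]})
    history = call_tool("fetch_history", {"space_id": space_id})
    # If it still appears, treat it as a Cloud limitation, not a routing error.
    return not any(m["memory_id"] == memory_id for m in history["memories"])
```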
Use Cases
Coding: Persistent Architecture Context
```
You: remember that we chose PostgreSQL over MongoDB because our data is highly relational
     [space_id: coding:my-saas]

-- next day, new session --

You: what database did we choose and why?
→ recall finds: "Chose PostgreSQL over MongoDB — highly relational data model"
```
Study: Cross-Session Learning Notes
```
You: remember: bias-variance tradeoff — high bias = underfitting, high variance = overfitting
     [space_id: study:ml-notes]

-- later --

You: briefing for study:ml-notes
→ Returns: profile (technical skills), recent episodes, key facts, foresights
```
Chat: Personal Preferences
```
You: remember I prefer dark mode, vim keybindings, and concise responses
     [space_id: chat:preferences]

-- any future session --

You: recall my UI preferences
→ "Prefers dark mode, vim keybindings, concise responses"
```
Configuration
| Variable | Default | Description |
|---|---|---|
| `EVERMEMOS_API_KEY` | (required) | EverMemOS Cloud API key |
| `EVERMEMOS_USER_ID` | `mcp-user` | Default user identity |
| `EVERMEMOS_BASE_URL` | `https://api.evermind.ai` | API endpoint |
| `EVERMEMOS_API_VERSION` | `v0` | API version |
| `EVERMEMOS_ENABLE_CONVERSATION_META` | `true` | Sync conversation metadata |
| `EVERMEMOS_DEFAULT_TIMEZONE` | `UTC` | Timezone for metadata |
| `EVERMEMOS_DEFAULT_SPACE` | (auto) | Default `space_id`. If unset, auto-detected from the git remote as `coding:<repo-name>` |
| `EVERMEMOS_LLM_CUSTOM_SETTING_JSON` | — | Custom LLM extraction settings |
| `EVERMEMOS_USER_DETAILS_JSON` | — | User profile details for conversations |
Space Auto-Detection
When `space_id` is omitted from `remember` or `recall`, the server automatically infers a default from:

1. The `EVERMEMOS_DEFAULT_SPACE` environment variable (if set)
2. The git remote origin URL → `coding:<repo-name>` (e.g. `coding:my-saas`)

This means that inside a git project you can simply call `remember` without specifying a space — memories are automatically routed to the right place.
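The git-based default can be approximated like this (an illustrative sketch; the server's actual detection logic may handle more URL shapes):

```python
import re

def space_from_remote(remote_url: str) -> str:
    """Derive coding:<repo-name> from a git remote URL (illustrative)."""
    # Take the last path segment and strip a trailing .git suffix.
    name = remote_url.rstrip("/").rsplit("/", 1)[-1]
    name = re.sub(r"\.git$", "", name)
    return f"coding:{name}"

print(space_from_remote("https://github.com/tt-a1i/evermemos-mcp.git"))  # coding:evermemos-mcp
```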
flush Boundary Rules
`flush` controls when EverMemOS triggers memory extraction:

| Scenario | `flush` |
|---|---|
| Mid-conversation, more messages coming | `false` |
| End of session / topic switch / summary | `true` |
| Uncertain | `true` (safer default) |
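The boundary rules above reduce to a tiny helper (illustrative; the parameter names are made up for this sketch):

```python
def should_flush(session_ending: bool, topic_switched: bool, certain: bool = True) -> bool:
    """Decide the flush flag per the boundary rules above.

    Flush (trigger extraction) at session end or on topic switches,
    and default to flushing when uncertain.
    """
    if not certain:
        return True  # safer default
    return session_ending or topic_switched
```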
Development
```bash
uv sync --group dev                                                # Install dev dependencies
uv run ruff check                                                  # Lint
uv run pytest                                                      # Unit tests
EVERMEMOS_RUN_INTEGRATION_TESTS=true uv run pytest -m integration  # Integration tests
```
CI runs on every push and PR via `.github/workflows/ci.yml`.
Documentation
| Document | Description |
|---|---|
| `docs/01-requirements.md` | Product requirements |
| `docs/02-architecture.md` | Technical architecture |
| `docs/03-demo-playbook.md` | Demo walkthrough |
| `docs/04-submission.md` | Submission checklist |
| `docs/05-client-integrations.md` | Client setup guides |
| `docs/06-benchmark.md` | Benchmark protocol and acceptance gates |
| `docs/07-release-checklist.md` | Release readiness checklist |
| `docs/competition/benchmark_deep_dive.md` | Primary evidence deep dive |
| `docs/auto-memory-prompt.md` | Auto-memory prompt templates for CLAUDE.md / Cursor / Cline |
| `CHANGELOG.md` | Version history |
License
See LICENSE.