matrix-context

Local-first, inspectable Mixture-of-Contexts engine and MCP server for agent memory

The inspectable context layer for agent memory

Matrix Context routes each query to a small set of typed context experts, retrieves with hybrid lexical + dense fusion, and assembles a token‑budgeted, fully explainable context pack — so any agent gets the right context, in less of it, and you can see exactly why.

Overview

Classic retrieval‑augmented generation embeds everything into one flat index and retrieves the nearest chunks for every query. For agent memory — which mixes user preferences, project decisions, code, policies, episodes, and documents — that is wasteful and opaque: it spends the prompt budget indiscriminately and cannot explain its choices.

Matrix Context implements Mixture‑of‑Contexts retrieval (MoC‑RAG). It treats the memory store as a set of typed context experts (session, profile, semantic, episodic, document, policy), routes each query to the smallest useful subset, retrieves inside those experts, and packs the results under a token budget, scoring each candidate by relevance, importance, and recency minus a redundancy penalty. Every selection is explainable through inspect() and the /v1/inspect API.
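
To make the budgeted packing idea concrete, here is a minimal sketch of a greedy pack with a redundancy penalty. It is not the library's implementation: the `Candidate` fields, the 0.6/0.2/0.2 weights, the word-overlap redundancy measure, and the whitespace token count are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    relevance: float   # assumed: retrieval score in [0, 1]
    importance: float  # assumed: set at write time, in [0, 1]
    recency: float     # assumed: 1.0 = just written, decaying toward 0


def count_tokens(text):
    # Crude stand-in for a real tokenizer: whitespace-separated words.
    return len(text.split())


def overlap(a, b):
    # Jaccard overlap of word sets, used here as the redundancy measure.
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if (wa | wb) else 0.0


def score(cand, kept, penalty):
    # Hypothetical weights; redundancy is the worst overlap with any kept item.
    base = 0.6 * cand.relevance + 0.2 * cand.importance + 0.2 * cand.recency
    redundancy = max((overlap(cand.text, k.text) for k in kept), default=0.0)
    return base - penalty * redundancy


def pack(candidates, max_tokens, penalty=0.5):
    kept, dropped, used = [], [], 0
    remaining = list(candidates)
    while remaining:
        # Re-score every round, because redundancy depends on what is already kept.
        best = max(remaining, key=lambda c: score(c, kept, penalty))
        remaining.remove(best)
        cost = count_tokens(best.text)
        if used + cost <= max_tokens:
            kept.append(best)
            used += cost
        else:
            dropped.append((best, "over budget"))
    return kept, dropped


a = Candidate("The team uses Postgres for production", 0.9, 0.5, 0.5)
b = Candidate("The team uses Postgres in production", 0.85, 0.5, 0.5)
c = Candidate("Deploys happen every Friday afternoon", 0.4, 0.5, 0.5)
kept, dropped = pack([a, b, c], max_tokens=12)
print([k.text for k in kept])
```

With the penalty enabled, the near-duplicate `b` loses its slot to the less relevant but novel `c`. With `penalty=0`, the same budget is spent on `a` and `b`, and `c` no longer fits.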

It is local‑first (single‑file SQLite, a numpy‑only core, zero model download), and it is a standard: the public wire contract is frozen as MoC Contract v1 with an executable conformance suite, so any storage engine, embedder, or framework can implement the same inspectable behaviour.

Capability	What it means
Typed routing	A two‑tier hybrid router (centroid + keyword + type + scope + activity priors) selects the right experts before retrieving, and widens on uncertainty.
Hybrid retrieval	BM25 + dense vectors fused with Reciprocal Rank Fusion — robust when either channel is weak.
Budgeted assembly	Greedy pack under a token budget scored by relevance · importance · recency − redundancy (MMR).
Inspectable	`inspect()`, `POST /v1/inspect`, and a built‑in Context Inspector UI expose routing scores and every kept/dropped item with a score breakdown.
Standard contract	JSON Schema 2020‑12 + OpenAPI 3.1 + MCP mapping + SemVer policy, with `python -m moc_contract.conformance`.
Benchmarked	A public, reproducible benchmark with paraphrased/adversarial robustness splits.
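
Reciprocal Rank Fusion, named in the table above, is simple enough to sketch in a few lines. The version below is a generic illustration rather than the package's code: each document scores the sum of 1 / (k + rank) over the ranked lists it appears in. The constant `k=60` is the value commonly used in the literature; the value the engine uses is not stated here, and the document ids are made up.

```python
from collections import defaultdict


def rrf(rankings, k=60):
    """Fuse ranked lists of ids: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_ranking = ["a", "b", "c"]   # hypothetical lexical results
dense_ranking = ["c", "a", "d"]  # hypothetical dense results
fused = rrf([bm25_ranking, dense_ranking])
print(fused)
```

Because only ranks are used, the two channels do not need comparable score scales, and a document that one channel misses entirely can still surface through the other.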

How it works

Matrix Context has two paths that meet at one typed memory store: a write path (you ingest data) and a read path (an agent recalls context).

[Figure: How Matrix Context works: ingest path and recall path]

Where do I put my data? Ingest on the write path — call ctx.remember(...) from the SDK or POST /v1/remember over HTTP — for anything you want an agent to recall later: documents and files, chats and sessions, decisions, user preferences, policies, and tool/API outputs. Each item is tagged with a type (which expert it belongs to) and a scope (e.g. project:acme or user:42), and stored in SQLite with an embedding.
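
As an illustration of what "tagged with a type and a scope, and stored in SQLite with an embedding" could look like, here is a small standard-library sketch. The table layout, column names, and the `store_item` / `items_in_scope` helpers are hypothetical; the package's actual schema is not shown in this description.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE items (
        id        INTEGER PRIMARY KEY,
        type      TEXT NOT NULL,   -- which expert the item belongs to
        scope     TEXT NOT NULL,   -- e.g. project:acme or user:42
        text      TEXT NOT NULL,
        embedding TEXT             -- JSON vector; can be rebuilt from text
    )
    """
)


def store_item(text, type_, scope, embedding=None):
    # Hypothetical helper: the row is the record, the vector is optional.
    vec = json.dumps(embedding) if embedding is not None else None
    cur = conn.execute(
        "INSERT INTO items (type, scope, text, embedding) VALUES (?, ?, ?, ?)",
        (type_, scope, text, vec),
    )
    conn.commit()
    return cur.lastrowid


def items_in_scope(scope):
    # Recall only ever looks inside one scope, which keeps tenants isolated.
    rows = conn.execute(
        "SELECT type, text FROM items WHERE scope = ? ORDER BY id", (scope,)
    )
    return rows.fetchall()


store_item("The team uses Postgres for production.", "semantic", "project:acme", [0.1, 0.9])
store_item("Prefers concise answers.", "profile", "user:42")
print(items_in_scope("project:acme"))
```

Keeping the embedding nullable reflects the idea that the SQL row is the record and the vector can be recomputed later.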

What happens at query time? On the read path, the hybrid router selects the few experts a query actually needs, retrieval runs inside them (BM25 + dense), results are reranked and packed under a token budget, and the pack is handed to your LLM/agent. Every decision — selected vs. dropped experts, scores, and reasons — is available through inspect() and /v1/inspect.
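
The routing step can be sketched as follows. This is a simplified, hypothetical router, not the engine's: it combines only a centroid similarity and a keyword signal (the type, scope, and activity priors are omitted), the 0.7/0.3 weights and the `margin` rule for widening are assumptions, and the expert dictionary shape is invented for the example.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def route(query_vec, query_text, experts, top_k=1, margin=0.1):
    """experts: {name: {"centroid": [...], "keywords": {...}}} (hypothetical shape)."""
    words = set(query_text.lower().split())
    scores = {}
    for name, expert in experts.items():
        keyword_hit = 1.0 if words & expert["keywords"] else 0.0
        scores[name] = 0.7 * cosine(query_vec, expert["centroid"]) + 0.3 * keyword_hit
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    selected = ranked[:top_k]
    # Widen on uncertainty: also keep runners-up that score within
    # `margin` of the last expert that made the top_k cut.
    cutoff = scores[selected[-1]] - margin
    for name in ranked[top_k:]:
        if scores[name] < cutoff:
            break
        selected.append(name)
    return selected, scores


experts = {
    "semantic": {"centroid": [1.0, 0.0], "keywords": {"database", "db"}},
    "policy": {"centroid": [0.0, 1.0], "keywords": {"policy"}},
    "episodic": {"centroid": [1.0, 1.0], "keywords": {"yesterday"}},
}
confident, _ = route([1.0, 0.0], "which database do we use", experts)
uncertain, _ = route([1.0, 1.0], "what happened", experts, margin=0.25)
print(confident, uncertain)
```

The first query matches one expert clearly, so only that expert is searched. The second query is ambiguous, so the router widens to the experts whose scores fall within the margin.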

Production tip: keep scopes per tenant/user/project so recall stays isolated, set importance and TTL on writes, and treat SQL as the system of record (vectors are a rebuildable accelerator).

Install

pip install matrix-context                 # core — zero model download
pip install "matrix-context[embeddings]"   # + a real semantic embedder (recommended)
pip install "matrix-context[all]"          # + mcp, postgres, milvus, conformance

Quickstart

A few lines of Python give any agent memory:

import matrix_context as mc

memory = mc.open("demo")
memory.add("The team uses Postgres for production.")
print(memory.ask("What database do we use?"))     # prompt-ready context
print(memory.inspect("What database do we use?"))  # why each item won

…or five lines on the command line (mc and matrix-context are the same tool):

pip install matrix-context
mc init demo
mc add "The team uses Postgres for production." --expert semantic
mc ask "What database do we use?"
mc inspect "What database do we use?"

That's the whole loop: add to remember (text, a file, a folder, or a URL), ask for a prompt-ready pack, inspect to see why memory was selected. mc doctor checks your setup; mc list / mc forget manage items; mc serve --ui (or mc ui) opens the Console.

Three levels, one engine

The beginner API is a thin wrapper — the advanced API is always there underneath.

# Beginner — 90% of users
import matrix_context as mc
memory = mc.open("demo")
memory.add("The team uses Postgres.")
memory.ask("which db?")

# Agent developer — a clean chat loop
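def chat(user_message):
    # `llm` is a placeholder for your own model call (prompt in, answer out).
    context = memory.context_for(user_message)   # memory relevant to this message
    answer = llm(f"Relevant memory:\n{context}\n\nUser:\n{user_message}")
    memory.record_turn(user_message, answer)     # record the exchange
    return answer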

# Advanced / research — the full engine (unchanged)
from matrix_context import ContextManager
ctx = ContextManager.create("demo", path="demo.db")
pack = ctx.build_pack("which db?", scope="project:demo", max_tokens=400)

Use case	API
Beginner	`mc.open`, `memory.add`, `memory.ask`, `memory.inspect`
Agent developer	`memory.context_for`, `memory.record_turn`
Advanced / research	`ContextManager`, `build_pack`, `inspect`

# REST server + UIs  ->  Inspector at http://127.0.0.1:8088/ , Console at /console
mc serve --transport rest --port 8088     # add --ui to open the Console in a browser

# Full control plane / admin UI (also the Hugging Face demo)  ->  http://127.0.0.1:7860
python frontend/server.py

Reproduce everything in five minutes, offline, with no model download:

git clone https://github.com/agent-matrix/matrix-context && cd matrix-context
make install                                # pip install -e ".[dev]"
make test                                   # full suite incl. an end-to-end test
make eval                                   # routed vs. flat RAG (feasibility)
make conformance                            # -> MoC API v1 Compatible ✓
make benchmark                              # build dataset + robustness comparison

Tutorials

Practical, copy‑paste guides — start here:

Build your first chatbot — a beginner‑first guide to the build_pack → remember → inspect loop (no API keys for the first example).
Integrate with LangChain, LangGraph & CrewAI — runnable demos that download a real document, ingest it, and query it from each framework; includes the advantages over flat RAG / a vector DB and how it scales for the enterprise.
Console walkthrough — a tour of the control plane with screenshots, plus a medical‑assistant demo with a quality check.

Architecture

query → hybrid route → retrieve in selected experts → rerank → budgeted pack → explain

SQL is the source of truth (metadata, governance); vectors are an accelerator. The same engine is exposed through a Python SDK, a CLI, and a REST surface, with an MCP binding mapping the same objects to tools and resources. See docs/architecture.md, docs/routing.md, and the routing diagram.

The standard: MoC Contract v1

Matrix Context is positioned as a protocol and inspectability standard, not just an engine. moc_contract/ freezes a versioned public contract:

20 JSON Schema (2020‑12) wire objects and an OpenAPI 3.1 description of the /v1 surface;
an MCP mapping (REST is the source of truth, MCP the interop binding);
a SemVer compatibility policy (contract_version is independent of the package version);
an executable conformance suite — a server is MoC API v1 Compatible when it passes it.

python -m moc_contract.conformance --url http://127.0.0.1:8088   # -> MoC API v1 Compatible ✓
python -m moc_contract.badges                                    # regenerate the README badges
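
Because the contract version is tracked independently of the package version, a client may want to check it before relying on a server. The sketch below shows one conventional SemVer reading (same major version, server minor at least what the client needs). The `is_compatible` helper and this exact rule are assumptions for illustration; the authoritative policy is the one shipped in moc_contract/.

```python
def parse(version):
    # Accepts plain "MAJOR.MINOR.PATCH" strings only.
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(server_contract, client_needs):
    s_major, s_minor, _ = parse(server_contract)
    c_major, c_minor, _ = parse(client_needs)
    # Same major: no breaking changes. Higher or equal minor: features present.
    return s_major == c_major and s_minor >= c_minor


print(is_compatible("1.2.0", "1.1.0"))
print(is_compatible("2.0.0", "1.1.0"))
```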

The load‑bearing, differentiating object is the inspect response: selected vs. unselected experts, per‑expert routing scores, kept items with score breakdowns, dropped items with reasons, and the prompt‑ready pack.

Benchmark

The MoC‑RAG Benchmark is a public, reproducible suite (1,000 typed items, 600 queries, six domains, five hard‑negative kinds) with parallel keyword / paraphrased / adversarial query splits. It supports a careful, evidence‑based claim:

MoC‑RAG improves robustness and context efficiency for typed agent memory under paraphrased and adversarial retrieval conditions; it does not universally beat all RAG. BM25 remains strong on keyword‑aligned queries (100% vs. 96% Recall@8). Under adversarial lexical shift, BM25 drops about 36 points (100% to 64%) while MoC‑RAG drops about 17 (96% to 79%) and overtakes it, carrying roughly half the hard distractors of the dense baseline family at 95–100% routing accuracy.

Recall@8 (real embedder)	keyword	paraphrased	adversarial
`bm25_rag`	100%	81%	64%
`moc_rag_e3`	96%	89%	79%
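
The point drops quoted in the claim follow directly from the table. The short check below recomputes them from the reported Recall@8 values; nothing is assumed beyond the numbers in the table.

```python
# Recall@8 (percent) as reported in the table above.
recall_at_8 = {
    "bm25_rag": {"keyword": 100, "paraphrased": 81, "adversarial": 64},
    "moc_rag_e3": {"keyword": 96, "paraphrased": 89, "adversarial": 79},
}

# Points lost when moving from keyword-aligned to adversarial queries.
drops = {name: r["keyword"] - r["adversarial"] for name, r in recall_at_8.items()}

# MoC-RAG's lead over BM25 on the adversarial split.
lead = recall_at_8["moc_rag_e3"]["adversarial"] - recall_at_8["bm25_rag"]["adversarial"]

print(drops, lead)
```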

Dataset: ruslanmv/moc-rag-benchmark · full results and interpretation in benchmarks/moc_rag_benchmark/results/FINDINGS.md.

Documentation

Topic	Link
Tutorials	`tutorials/` (chatbot · frameworks · console)
Architecture & routing	`docs/architecture.md`, `docs/routing.md`
REST API & Inspector UI	`docs/rest.md`
Control plane / admin UI (native)	`frontend/`
Hugging Face Space (packaging)	`hf/`
MoC Contract v1	`moc_contract/README.md`
Adapters (agent‑generator, HomePilot)	`docs/adapters/`
Benchmark	`benchmarks/README.md`
Manuscript (LaTeX)	`docs/paper/latex/`
Changelog · Contributing · Release	`docs/CHANGELOG.md` · `docs/CONTRIBUTING.md` · `docs/RELEASE.md`
Project structure	`docs/PROJECT_STRUCTURE.md`

Status

Version 0.1.0 ships the engine (routing, hybrid retrieval, budgeted packing, inspect), the SQLite store, the Python SDK and CLI, the evaluation harness, a v1 REST surface implementing MoC Contract v1 with a conformance suite and the Context Inspector UI, the agent‑generator and HomePilot adapters, and the MoC‑RAG Benchmark. The MCP server, governance plane, memory lifecycle (dedup / contradiction / consolidation), and Postgres/pgvector are scaffolded and staged for v1; Milvus and a learned router are planned for v2. See docs/PROJECT_STRUCTURE.md.

Citation

This repository is the official reference implementation for the manuscript Matrix Context: Mixture‑of‑Contexts RAG for Robust and Inspectable Agent Memory (see docs/paper/latex/). If you use Matrix Context or the MoC‑RAG Benchmark, please cite it. Citation metadata is in CITATION.cff and .zenodo.json; a DOI will be minted on the tagged release.

@software{matrix_context_2026,
  title     = {Matrix Context: Mixture-of-Contexts RAG for Robust and Inspectable Agent Memory},
  author    = {Magana Vsevolodovna, Ruslan},
  year      = {2026},
  url       = {https://github.com/agent-matrix/matrix-context},
  note      = {Independent Researcher, Genova, Italy. DOI forthcoming.}
}

The Console (live demo)

A wired control plane / admin UI ships in frontend/ and is deployed as a Hugging Face Space (hf/). A full walkthrough and a medical-assistant demo are in tutorials/.

[Screenshots: Overview · Inspector (the "why") · Integrate an agent]

python frontend/server.py     # -> http://127.0.0.1:7860   (Inspector also at matrix-context serve → / and /console)

Acknowledgements

Part of the Agent‑Matrix ecosystem: Matrix Hub catalogs and installs, agent‑generator generates, HomePilot proves local‑first memory, and Matrix Context is the runtime context plane underneath.

License

Apache‑2.0 © Ruslan Magana Vsevolodovna — Independent Researcher, Genova, Italy.

Project details

This version

0.1.0

Jun 5, 2026

Download files

Source Distribution

matrix_context-0.1.0.tar.gz (63.3 kB)

Uploaded Jun 5, 2026 Source

Built Distribution

matrix_context-0.1.0-py3-none-any.whl (78.8 kB)

Uploaded Jun 5, 2026 Python 3

File details

Details for the file matrix_context-0.1.0.tar.gz.

File metadata

Download URL: matrix_context-0.1.0.tar.gz
Upload date: Jun 5, 2026
Size: 63.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for matrix_context-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`1a0b2a68c1bd1172a74dfb6f090cc3fcfb2906f7c0b3cc676446d34cab38f3fa`
MD5	`2b80695ab75b16ce36d092f632659f1b`
BLAKE2b-256	`9e1722c354a04197b3eea7bdc5cd03b2377ed0ce3e26ce870a93323fffc5fd6c`

Provenance

The following attestation bundles were made for matrix_context-0.1.0.tar.gz:

Publisher: release.yml on agent-matrix/matrix-context

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: matrix_context-0.1.0.tar.gz
- Subject digest: 1a0b2a68c1bd1172a74dfb6f090cc3fcfb2906f7c0b3cc676446d34cab38f3fa
- Sigstore transparency entry: 1734557519
- Sigstore integration time: Jun 5, 2026
Source repository:
- Permalink: agent-matrix/matrix-context@4a5615548c6a7e2a60ec9f6805d6b8947950f54a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/agent-matrix
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@4a5615548c6a7e2a60ec9f6805d6b8947950f54a
- Trigger Event: release

File details

Details for the file matrix_context-0.1.0-py3-none-any.whl.

File metadata

Download URL: matrix_context-0.1.0-py3-none-any.whl
Upload date: Jun 5, 2026
Size: 78.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for matrix_context-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b8f2a1561a4c32ef25182029af7a03fedf68537c5dd0d6d15b2d48d49cc207ea`
MD5	`b9e8f314cab5de5569c467697df76085`
BLAKE2b-256	`75813414bcdafd9c3eaf0b579a3734a12e7db2b30d4992814a57f5b3bbb170cd`

Provenance

The following attestation bundles were made for matrix_context-0.1.0-py3-none-any.whl:

Publisher: release.yml on agent-matrix/matrix-context

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: matrix_context-0.1.0-py3-none-any.whl
- Subject digest: b8f2a1561a4c32ef25182029af7a03fedf68537c5dd0d6d15b2d48d49cc207ea
- Sigstore transparency entry: 1734557532
- Sigstore integration time: Jun 5, 2026
Source repository:
- Permalink: agent-matrix/matrix-context@4a5615548c6a7e2a60ec9f6805d6b8947950f54a
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/agent-matrix
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@4a5615548c6a7e2a60ec9f6805d6b8947950f54a
- Trigger Event: release
