

Quantum Memory Graph ⚛️🧠

Relationship-aware memory for AI agents. Knowledge graphs + quantum-optimized subgraph selection.

Every memory system treats memories as independent documents: search, rank, stuff into context. But memories aren't independent. They have relationships. "The team chose React" becomes 10x more useful paired with "because of ecosystem maturity" and "FastAPI handles the backend."

Quantum Memory Graph maps these relationships, then uses QAOA (the Quantum Approximate Optimization Algorithm) to find the optimal combination of memories: not just the most relevant individual hits, but the best-connected subgraph that gives your agent maximum context.

Benchmarks

LongMemEval (ICLR 2025) – Industry Standard

500 questions across 53 conversation sessions. The gold standard for AI memory retrieval.

| System | R@5 | R@10 | NDCG@10 |
|---|---|---|---|
| Quantum Memory Graph (gte-large) | 96.6% | 98.7% | 94.3% |
| MemPalace raw | 96.6% | 98.2% | 88.9% |
| Quantum Memory Graph (e5-large) | 96.0% | 98.1% | 94.6% |
| Quantum Memory Graph (bge-large) | 95.9% | 98.2% | 94.0% |
| OMEGA | 95.4% | – | – |
| Mastra OM | 94.9% | – | – |
| Quantum Memory Graph (MiniLM, default) | 93.4% | 97.4% | 90.8% |

#1 to our knowledge. Tied on R@5, best R@10 and NDCG@10 among published results. Free. Open source.

Use model="thenlper/gte-large" for #1 accuracy, or model="intfloat/e5-large-v2" for best NDCG ranking.

MemCombine – Combination Recall (250 Scenarios)

| Method | Coverage | Evidence Recall | F1 | Perfect |
|---|---|---|---|---|
| Embedding Top-K | 92.3% | 93.9% | 91.3% | 181/250 |
| Graph + QAOA | 96.2% | 97.7% | 95.1% | 212/250 |

Graph-aware quantum selection beats pure similarity search by 3.8–3.9 percentage points across metrics on combination tasks.

Choosing a Model

| Model | Size | GPU? | R@5 | Best For |
|---|---|---|---|---|
| all-MiniLM-L6-v2 (default) | 90MB | No | 93.4% | Laptops, CI/CD, quick prototyping |
| BAAI/bge-large-en-v1.5 | 1.3GB | Recommended | 95.9% | Production servers with GPU |
| intfloat/e5-large-v2 | 1.3GB | Recommended | 96.0% | Best ranking quality (NDCG) |
| thenlper/gte-large | 1.3GB | Recommended | 96.6% | Maximum retrieval accuracy |

from quantum_memory_graph import MemoryGraph

# Default: works everywhere, no GPU needed
mg = MemoryGraph()

# High accuracy: needs ~2GB RAM; a GPU speeds it up ~60x
mg = MemoryGraph(model="thenlper/gte-large")

The default model runs on any machine. Larger models need more RAM and benefit from a GPU, but one isn't required; they'll just be slower on CPU.

Install

pip install quantum-memory-graph

Quick Start

from quantum_memory_graph import store, recall

# Store memories: the knowledge graph is built automatically
store("Project Alpha uses React frontend with TypeScript.")
store("Project Alpha backend is FastAPI with PostgreSQL.")
store("FastAPI connects to PostgreSQL via SQLAlchemy ORM.")
store("React components use Material UI for styling.")
store("Team had pizza for lunch. Pepperoni was great.")

# Recall: graph traversal + QAOA finds the optimal combination
result = recall("What is Project Alpha's full tech stack?", K=4)

for memory in result["memories"]:
    print(f"  {memory['text']}")
    print(f"    Connected to {len(memory['connections'])} other selected memories")

Output: Returns the React, FastAPI, PostgreSQL, and SQLAlchemy memories: connected, complete, no noise. The pizza memory is excluded because it has no graph connections to the tech-stack cluster.

How It Works

Query: "What's the tech stack?"
        │
        ▼
┌──────────────────────┐
│  1. Graph Search     │  Embedding similarity + multi-hop traversal
│     Find neighbors   │  Discovers memories connected to relevant ones
└─────────┬────────────┘
          │ 14 candidates
          ▼
┌──────────────────────┐
│  2. Subgraph Data    │  Extract adjacency matrix + relevance scores
│     Build problem    │  Encode relationships as optimization weights
└─────────┬────────────┘
          │ NP-hard selection
          ▼
┌──────────────────────┐
│  3. QAOA Optimize    │  Quantum approximate optimization
│     Find best K      │  Maximizes: relevance + connectivity + coverage
└─────────┬────────────┘
          │ K memories
          ▼
┌──────────────────────┐
│  4. Return with      │  Each memory includes its connections
│     relationships    │  to other selected memories
└──────────────────────┘
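The pipeline above can be sketched classically with toy data (the scores, weights, and helper names here are illustrative, not the library's internals; in the real system, QAOA replaces the brute-force selection in step 3):

```python
from itertools import combinations

# Toy data: per-memory relevance scores and weighted relationship edges.
relevance = {"react": 0.9, "fastapi": 0.8, "postgres": 0.7, "sqlalchemy": 0.6, "pizza": 0.1}
edges = {
    frozenset({"react", "fastapi"}): 0.5,
    frozenset({"fastapi", "postgres"}): 0.8,
    frozenset({"postgres", "sqlalchemy"}): 0.7,
    frozenset({"react", "sqlalchemy"}): 0.2,
}  # "pizza" has no edges into the tech-stack cluster

def select_subgraph(relevance, edges, K, alpha=0.4, beta=0.35):
    """Brute-force stand-in for the QAOA step: pick the K memories that
    jointly maximize alpha * relevance + beta * connectivity."""
    best_set, best_score = None, float("-inf")
    for subset in combinations(relevance, K):
        rel = sum(relevance[n] for n in subset)
        conn = sum(w for pair, w in edges.items() if pair <= set(subset))
        score = alpha * rel + beta * conn
        if score > best_score:
            best_set, best_score = set(subset), score
    return best_set

selected = select_subgraph(relevance, edges, K=4)
print(sorted(selected))  # the connected tech-stack cluster; "pizza" is excluded
```

Even though "pizza" outranks nothing on relevance alone here, any subset containing it forfeits the connectivity term, which is exactly why the graph-aware objective drops it.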

Why Quantum?

Optimal subgraph selection is NP-hard. Given N candidate memories, finding the best K that jointly maximize relevance, connectivity, and coverage means searching an exponentially large space of subsets. QAOA produces approximate solutions in polynomial time that beat greedy heuristics in our benchmarks, and the selection problem maps naturally onto the optimization form quantum circuits solve.
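To get a feel for the scale (illustrative numbers, K=5): brute force is feasible at the default 14-candidate cap, but the subset count explodes as the candidate pool grows.

```python
from math import comb

# Number of K-subsets to evaluate when choosing the best K=5 of N candidates.
for n in (14, 100, 1000):
    print(n, comb(n, 5))
# At n=14 there are only 2002 subsets; at n=1000 there are trillions.
```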

Architecture

Three Layers

  1. Knowledge Graph (graph.py): memories are nodes; relationships are weighted edges based on:

    • Semantic similarity (embedding cosine distance)
    • Entity co-occurrence (shared people, projects, concepts)
    • Temporal proximity (memories close in time)
    • Source proximity (same conversation/document)
  2. Subgraph Optimizer (subgraph_optimizer.py): a QAOA circuit that maximizes:

    • α × relevance (individual memory scores)
    • β × connectivity (edge weights within the selected subgraph)
    • γ × coverage (topic diversity across the selection)
  3. Pipeline (pipeline.py): unified store() and recall() interface.
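For intuition, a combined edge weight over the four relationship signals might look like the sketch below. The mixing weights, entity saturation point, and time-decay constant are all illustrative assumptions, not the library's actual values:

```python
def edge_weight(sim, shared_entities, dt_seconds, same_source,
                w_sim=0.5, w_ent=0.25, w_time=0.15, w_src=0.10):
    """Combine four relationship signals into a single edge weight in [0, 1].

    All weights and constants here are hypothetical, for illustration only.
    """
    entity_score = min(shared_entities / 3.0, 1.0)   # saturate at 3 shared entities
    time_score = 1.0 / (1.0 + dt_seconds / 3600.0)   # decays on an hourly scale
    source_score = 1.0 if same_source else 0.0       # same conversation/document
    return (w_sim * sim + w_ent * entity_score
            + w_time * time_score + w_src * source_score)

# Two memories from the same conversation, 10 minutes apart, 2 shared entities:
w = edge_weight(sim=0.8, shared_entities=2, dt_seconds=600, same_source=True)
print(round(w, 3))
```

Because the weights sum to 1 and each signal is clamped to [0, 1], the combined edge weight stays in [0, 1] as well, which keeps the optimizer's connectivity term on the same scale as the relevance scores.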

Optional: MemPalace Integration

Use MemPalace (MIT, by @bensig) as the storage/retrieval backend for 96.6% base retrieval quality:

from quantum_memory_graph.mempalace_bridge import store_memory, recall_memories

# MemPalace stores verbatim → ChromaDB retrieves candidates → QAOA selects the optimal subgraph
result = recall_memories("What happened in the meeting?", K=5, use_qaoa=True)

API Server

pip install quantum-memory-graph[api]
python -m quantum_memory_graph.api

Endpoints:

  • POST /store – Store a memory
  • POST /recall – Graph + QAOA recall
  • POST /store-batch – Batch store
  • GET /stats – Graph statistics
  • GET / – Health check

Advanced Usage

Custom Graph

from quantum_memory_graph import MemoryGraph, recall
from quantum_memory_graph.pipeline import set_graph

# Tune similarity threshold for edge creation
graph = MemoryGraph(similarity_threshold=0.25)
set_graph(graph)

# Store and recall as normal

Tune QAOA Parameters

result = recall(
    "query",
    K=5,
    alpha=0.4,       # Relevance weight
    beta_conn=0.35,   # Connectivity weight  
    gamma_cov=0.25,   # Coverage/diversity weight
    hops=3,           # Graph traversal depth
    top_seeds=7,      # Initial seed nodes
    max_candidates=14, # Max qubits for QAOA
)

Run MemCombine Benchmark

from benchmarks.memcombine import run_benchmark

def my_recall(memories, query, K):
    # Replace with your recall implementation. This naive baseline scores
    # each memory (assumed to be a string) by word overlap with the query.
    q = set(query.lower().split())
    overlap = [len(q & set(m.lower().split())) for m in memories]
    return sorted(range(len(memories)), key=lambda i: overlap[i], reverse=True)[:K]  # List[int]

results = run_benchmark(my_recall, K=5)
print(f"Coverage: {results['avg_coverage']*100:.1f}%")

Deploying for AI Agents

Replace Your Current Memory System

QMG is a drop-in upgrade for existing memory systems (Mem0, LangChain memory, custom RAG):

# Before (typical flat similarity search)
results = memory.search("What's the tech stack?", k=5)

# After (graph-aware combination retrieval)
from quantum_memory_graph import store, recall

result = recall("What's the tech stack?", K=5)
# Returns connected memory clusters, not just individual matches

Run as a Microservice

Deploy the API server for multiple agents to share:

pip install quantum-memory-graph[api]

# Default model (lightweight, no GPU)
python -m quantum_memory_graph.api --port 8502

# High accuracy (needs GPU for best speed)
QMG_MODEL=thenlper/gte-large python -m quantum_memory_graph.api --port 8502

Then from any agent:

import requests

# Store a memory
requests.post("http://localhost:8502/store", json={"text": "User prefers dark mode"})

# Recall with graph + QAOA
result = requests.post(
    "http://localhost:8502/recall",
    json={"query": "What are the user's preferences?", "K": 5},
).json()  # parse the JSON body; shape mirrors recall()'s result dict

Migrate from Mem0 / LangChain

from quantum_memory_graph import store

# Export your existing memories and bulk import
for memory in existing_memories:
    store(memory["text"], metadata=memory.get("metadata"))
# Graph connections are built automatically during import

Production Tips

  • Shared API server: Run one instance and point all agents at it. The knowledge graph is shared, so Agent A's memories help Agent B's recall.
  • Model choice: Use gte-large on GPU servers (96.6% accuracy). Use the default MiniLM on laptops or CI (93.4%, no GPU needed).
  • Batch import: Use the /store-batch endpoint for bulk migration; roughly 10x faster than individual stores.
  • Persistence: Graph state saves to disk automatically, so you can restart the server without losing memories.
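Bulk migration through /store-batch might look like the sketch below. The request payload shape ({"memories": [{"text": ...}]}) is an assumption, so verify it against your server before relying on it:

```python
def batch_payloads(texts, batch_size=100):
    """Chunk raw memory texts into /store-batch payloads (payload shape assumed)."""
    for i in range(0, len(texts), batch_size):
        yield {"memories": [{"text": t} for t in texts[i:i + batch_size]]}

def bulk_import(texts, url="http://localhost:8502/store-batch"):
    import requests  # imported lazily; only needed when actually posting
    for payload in batch_payloads(texts):
        requests.post(url, json=payload, timeout=60).raise_for_status()

payloads = list(batch_payloads([f"memory {n}" for n in range(250)]))
print(len(payloads))  # 3 batches: 100 + 100 + 50
```

Chunking keeps individual requests small enough to avoid timeouts while still getting the batch-endpoint speedup over one-at-a-time stores.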

IBM Quantum Hardware

For production workloads, run QAOA on real quantum hardware:

pip install quantum-memory-graph[ibm]
export IBM_QUANTUM_TOKEN=your_token

Validated on ibm_fez and ibm_kingston backends.

Requirements

  • Python โ‰ฅ 3.9
  • sentence-transformers
  • networkx
  • qiskit + qiskit-aer
  • numpy

License

MIT License – Copyright 2026 Coinkong (Chef's Attraction)

Built with MemPalace by @bensig (MIT License). See THIRD-PARTY-LICENSES.
