Skip to main content

AI memory layer powered by GrafeoDB's hybrid graph and vector database

Project description

CI codecov PyPI License

grafeo-memory

AI memory layer powered by GrafeoDB, an embedded graph database with native vector search.

No servers, no Docker, no Neo4j, no Qdrant. One .db file + one LLM.

Typical memory stack: Containers with Neo4j + Qdrant, Embedding API + LLM
grafeo-memory stack:  grafeo (single file) + LLM

Install

uv add grafeo-memory                   # base (bring your own LLM + embedder)
uv add grafeo-memory[mistral]          # + Mistral embeddings
uv add grafeo-memory[openai]           # + OpenAI embeddings
uv add grafeo-memory[anthropic]        # + Anthropic embeddings
uv add grafeo-memory[mcp]             # + MCP server for AI agents
uv add grafeo-memory[all]              # all providers

Or with pip:

pip install grafeo-memory[openai]

Quick Start

OpenAI

from openai import OpenAI
from grafeo_memory import MemoryManager, MemoryConfig, OpenAIEmbedder

embedder = OpenAIEmbedder(OpenAI())
config = MemoryConfig(db_path="./memory.db", user_id="alice")

with MemoryManager("openai:gpt-4o-mini", config, embedder=embedder) as memory:
    # Add memories from conversation
    events = memory.add("I just started a new job at Acme Corp as a data scientist")
    # -> [ADD "alice works at acme_corp", ADD "alice is a data_scientist"]

    events = memory.add("I've been promoted to senior data scientist at Acme")
    # -> [UPDATE "alice is a senior data scientist at acme_corp"]

    events = memory.add("I left Acme and joined Beta Inc")
    # -> [DELETE "alice works at acme_corp", ADD "alice works at beta_inc"]

    # Search
    results = memory.search("Where does Alice work?")
    # -> [SearchResult(text="alice works at beta_inc", score=0.92, ...)]

Mistral

from mistralai import Mistral
from grafeo_memory import MemoryManager, MemoryConfig, MistralEmbedder

embedder = MistralEmbedder(Mistral())
config = MemoryConfig(db_path="./memory.db", user_id="alice")

with MemoryManager("mistral:mistral-small-latest", config, embedder=embedder) as memory:
    events = memory.add("I just started a new job at Acme Corp as a data scientist")
    results = memory.search("Where does Alice work?")

How It Works

grafeo-memory implements the reconciliation loop, the intelligence layer that decides what to remember:

  1. Extract facts from conversation text (LLM call)
  2. Extract entities and relationships (LLM tool call)
  3. Search existing memory for related facts (vector + graph)
  4. Reconcile new facts against existing memory (LLM decides ADD/UPDATE/DELETE/NONE)
  5. Execute the decisions against GrafeoDB
┌──────────────────────────────────────────┐
│             grafeo-memory                │
│                                          │
│  Extractor -> Reconciler -> Executor     │
│  (LLM)       (LLM)        (GrafeoDB)     │
└──────────────────┬───────────────────────┘
                   │
         ┌─────────┴──────────┐
         │      GrafeoDB      │
         │  Graph + Vector    │
         │  + Text (optional) │
         │  single .db file   │
         └────────────────────┘

Multi-User Isolation

config = MemoryConfig(db_path="./chat_memory.db")

with MemoryManager("openai:gpt-4o-mini", config, embedder=embedder) as memory:
    # Each user's memories are isolated
    memory.add("I love hiking in the mountains", user_id="bob")
    memory.add("I prefer beach vacations", user_id="carol")

    bob_results = memory.search("vacation preferences", user_id="bob")
    # -> hiking, mountains

    carol_results = memory.search("vacation preferences", user_id="carol")
    # -> beach vacations

Supported LLM Providers

grafeo-memory uses pydantic-ai model strings, so any provider pydantic-ai supports works out of the box:

# OpenAI — use OpenAIEmbedder for embeddings
MemoryManager("openai:gpt-4o-mini", config, embedder=OpenAIEmbedder(OpenAI()))

# Anthropic — pair with OpenAI or custom embedder
MemoryManager("anthropic:claude-sonnet-4-5-20250929", config, embedder=embedder)

# Groq — pair with OpenAI or custom embedder
MemoryManager("groq:llama-3.3-70b-versatile", config, embedder=embedder)

# Mistral — use MistralEmbedder for embeddings
MemoryManager("mistral:mistral-small-latest", config, embedder=MistralEmbedder(Mistral()))

# Google — pair with OpenAI or custom embedder
MemoryManager("google-gla:gemini-2.0-flash", config, embedder=embedder)

Built-in Embedders

Class Provider Default Model Install Extra
OpenAIEmbedder OpenAI text-embedding-3-small [openai]
MistralEmbedder Mistral mistral-embed [mistral]

Both accept an optional model parameter to override the default.

Custom Embeddings

Implement the EmbeddingClient protocol to use any embedding provider:

from grafeo_memory import EmbeddingClient

class MyEmbedder:
    def embed(self, texts: list[str]) -> list[list[float]]:
        # Call your embedding API
        return [...]

    @property
    def dimensions(self) -> int:
        return 1024  # your model's output dimensions

memory = MemoryManager("openai:gpt-4o-mini", config, embedder=MyEmbedder())

MCP Server

grafeo-memory includes a built-in MCP server so AI agents (Claude Desktop, Cursor, etc.) can use it as a tool.

uv add grafeo-memory[mcp]
# or: pip install grafeo-memory[mcp]

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "grafeo-memory": {
      "command": "grafeo-memory-mcp",
      "env": {
        "GRAFEO_MEMORY_MODEL": "openai:gpt-4o-mini",
        "GRAFEO_MEMORY_DB": "./memory.db"
      }
    }
  }
}

Available Tools

Tool Description
memory_add Add a memory by extracting facts from text
memory_add_batch Add multiple memories in one batch
memory_search Search memories by semantic similarity and graph context
memory_update Update an existing memory's text
memory_delete Delete a single memory
memory_delete_all Delete all memories for a user
memory_list List all stored memories
memory_summarize Consolidate old memories into topic-grouped summaries
memory_history Show change history for a memory

Environment Variables

Variable Default Description
GRAFEO_MEMORY_MODEL openai:gpt-4o-mini pydantic-ai model string
GRAFEO_MEMORY_DB (in-memory) Database file path
GRAFEO_MEMORY_USER default Default user ID
GRAFEO_MEMORY_YOLO (off) Set to 1 for all features

Transport

Supports stdio (default), SSE and streamable HTTP:

grafeo-memory-mcp              # stdio (default)
grafeo-memory-mcp sse          # SSE
grafeo-memory-mcp streamable-http

Note: This is different from grafeo-mcp, which exposes the raw GrafeoDB database. grafeo-memory-mcp wraps the high-level memory API (extract, reconcile, search, summarize).

Observability

grafeo-memory supports OpenTelemetry instrumentation via pydantic-ai. When enabled, all LLM calls (extraction, reconciliation, summarization, reranking) are traced automatically.

config = MemoryConfig(instrument=True)  # uses global OTel provider

For custom providers:

from grafeo_memory import InstrumentationSettings

config = MemoryConfig(instrument=InstrumentationSettings(
    tracer_provider=my_tracer_provider,
    include_content=False,
))

Why grafeo-memory?

Traditional stack grafeo-memory
Infrastructure Neo4j + Qdrant (Docker) Single .db file
Install size ~750MB (Docker images) ~16MB (uv add)
Offline/edge Requires servers Yes
Graph + vector Separate services Unified engine
LLM providers Varies pydantic-ai (OpenAI, Anthropic, Mistral, Groq, Google)
Embeddings External API required Protocol-based (any provider)

API Reference

MemoryManager

  • MemoryManager(model, config=None, *, embedder): create memory manager. model is a pydantic-ai model string (e.g. "openai:gpt-4o-mini")
  • .add(messages, user_id=None, session_id=None, metadata=None, *, infer=True, importance=1.0, memory_type="semantic")AddResult (list of MemoryEvent)
  • .search(query, user_id=None, k=10, *, filters=None, rerank=True, memory_type=None)SearchResponse (list of SearchResult)
  • .update(memory_id, text)MemoryEvent: update a memory's text directly
  • .get_all(user_id=None, memory_type=None)list[SearchResult]
  • .delete(memory_id)bool
  • .delete_all(user_id=None)int (count deleted)
  • .summarize(user_id=None, *, preserve_recent=5, batch_size=20)AddResult
  • .history(memory_id)list[HistoryEntry]
  • .set_importance(memory_id, importance)bool
  • .close(): close the database

Use as a context manager: with MemoryManager(...) as memory:. Multiple sessions in the same process are supported.

MemoryConfig

  • db_path: path to database file (None for in-memory)
  • user_id: default user scope (default "default")
  • session_id: default session scope
  • agent_id: default agent scope
  • reconciliation_threshold: minimum similarity for reconciliation candidates (default 0.3)
  • search_min_score: minimum score for search results, 0.0 returns everything (default 0.0)
  • agreement_bonus: score boost when both vector and graph find the same memory (default 0.1)
  • embedding_dimensions: vector dimensions (default 1536)
  • enable_importance: enable composite scoring with recency/frequency/importance (default False)
  • weight_topology: topology score weight for graph-connected memories (default 0.0, requires enable_importance)
  • enable_topology_boost: re-rank search results by graph connectivity, no LLM call (default False)
  • topology_boost_factor: strength of topology boost (default 0.2)
  • consolidation_protect_threshold: protect well-connected memories from summarize (default 0.0, off)
  • instrument: OpenTelemetry instrumentation, True or InstrumentationSettings (default False)

EmbeddingClient (Protocol)

  • .embed(texts: list[str]) -> list[list[float]]: generate embeddings for a batch of texts
  • .dimensions -> int: return the embedding vector dimensionality

Return Types

  • AddResult: list subclass of MemoryEvent, with .usage for LLM token counts
  • SearchResponse: list subclass of SearchResult, with .usage for LLM token counts
  • MemoryEvent: .action (ADD/UPDATE/DELETE/NONE), .memory_id, .text, .old_text
  • SearchResult: .memory_id, .text, .score, .user_id, .metadata, .relations, .memory_type
  • HistoryEntry: .event, .old_text, .new_text, .timestamp, .actor_id, .role

Iteration

# AddResult is iterable:
for event in memory.add("text"):
    print(event.action, event.text)

# SearchResponse is iterable:
for result in memory.search("query"):
    print(result.text, result.score)

Ecosystem

grafeo-memory is part of the GrafeoDB ecosystem:

  • grafeo: Core graph database engine (Rust)
  • grafeo-langchain: LangChain integration
  • grafeo-llamaindex: LlamaIndex integration
  • grafeo-mcp: MCP server for raw GrafeoDB access
  • grafeo-memory-mcp (built-in): MCP server for the memory API (uv add grafeo-memory[mcp] or pip install grafeo-memory[mcp])

All packages share the same .db file. Build memories with grafeo-memory, query them with grafeo-langchain, expose them via MCP.

Requirements

  • Python 3.12+

License

Apache-2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

grafeo_memory-0.2.0.tar.gz (195.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

grafeo_memory-0.2.0-py3-none-any.whl (68.7 kB view details)

Uploaded Python 3

File details

Details for the file grafeo_memory-0.2.0.tar.gz.

File metadata

  • Download URL: grafeo_memory-0.2.0.tar.gz
  • Upload date:
  • Size: 195.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for grafeo_memory-0.2.0.tar.gz
Algorithm Hash digest
SHA256 da7bff7f60c30a453956e8e46b1d741cac60ee9e901549033976e7bfc7acbf5e
MD5 1f371c59e6d10fbcd6a18020425cd8cd
BLAKE2b-256 c29b5b3d330ad046982a1f7d8496ba406faf0615444ec385213ad6224cd39b13

See more details on using hashes here.

Provenance

The following attestation bundles were made for grafeo_memory-0.2.0.tar.gz:

Publisher: pypi.yml on GrafeoDB/grafeo-memory

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file grafeo_memory-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: grafeo_memory-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 68.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for grafeo_memory-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0ea5162b1e77148198eebb826f2ad77c1c41a96d51d9f94fa680156e309ca1b5
MD5 8c863bb8a014f82eb630d47a27503e47
BLAKE2b-256 9a96609c86750dbe4b8471217239ec64832fedc22d35e2c795ea43bf96258cb1

See more details on using hashes here.

Provenance

The following attestation bundles were made for grafeo_memory-0.2.0-py3-none-any.whl:

Publisher: pypi.yml on GrafeoDB/grafeo-memory

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page