Local-first dynamic memory field with vector search and wave-field re-ranking

Maintainers

caspian

WaveMind

Local-first dynamic memory for apps, agents, notebooks, and tools.

WaveMind stores memories in SQLite, finds relevant candidates with vector search, then uses a wave-field priority layer to decide what still matters: hot facts rise, stale facts fade, temporary facts expire, and namespaces keep users or projects isolated.

What Is WaveMind?

WaveMind is a dynamic memory engine you can embed in a product.

Use it when your app needs to remember things like user preferences, decisions, corrections, notes, research snippets, support history, agent context, or temporary facts.

The short version:

normal vector search:  find the nearest text
WaveMind:              find the nearest useful memory

WaveMind is not trying to replace every vector database. It is the memory layer around retrieval: persistence, namespaces, TTL, hotness, priority, decay, explicit forgetting, audit events, and optional graph dynamics.

60-Second Version

Question	Answer
What does it store?	Text memories, vectors, metadata, tags, TTL, priority, and recall state.
Where does it store data?	A local SQLite file by default; Postgres is available for production state.
How do I use it?	CLI, Python API, FastAPI HTTP server, LangChain memory, or framework adapters.
What is different from Chroma/Qdrant?	WaveMind adds memory policy: hotness, decay, TTL, correction handling, and scoped recall.
When should I not use it?	For huge static document search where a mature vector DB is already the right tool.
What is the simplest install?	`python -m pip install wavemind`

Why Use It?

If you need...	WaveMind gives you...
Memory that survives restarts	One SQLite file stores text, vectors, metadata, TTL, and recall state.
Per-user or per-project recall	Namespaces and tags keep memories separated.
Temporary facts	`ttl_seconds` lets facts expire automatically.
Corrections and changing preferences	Newer or reinforced memories can outrank stale ones.
A simple integration path	Python API, CLI, FastAPI server, and LangChain memory class.
Production hygiene	Backups, audit log, API keys, rate limits, Prometheus metrics, and OpenTelemetry traces.

Quick Start

The shortest path from install to first recall:

python -m pip install wavemind
wavemind remember "Andrey is a trader" --namespace demo
wavemind query "What does Andrey do?" --namespace demo

Need a reminder after install?

wavemind quickstart

Want to see and manage memory in a browser?

wavemind studio

By default, WaveMind creates wavemind.sqlite3 in the current working directory. That file is the local source of truth. Keep it out of git and back it up like application state.

CLI Cheat Sheet

Start here if you only want to use WaveMind from the terminal:

Goal	Command
Show first-run help	`wavemind quickstart`
Store a memory	`wavemind remember "Andrey prefers short answers" --namespace user:42`
Search memory	`wavemind query "answer style" --namespace user:42`
Open local dashboard	`wavemind studio`
See stored state	`wavemind stats --namespace user:42`
Delete a namespace	`wavemind forget --namespace user:42`
Import notes	`wavemind import ./notes.txt --namespace project:alpha`
Use another database file	`wavemind --db ./state/memory.sqlite3 query "budget" --namespace user:42`
Start the HTTP API	`wavemind --db ./state/memory.sqlite3 serve --host 127.0.0.1 --port 8000`

After this point, choose the integration path you need: Python, HTTP, LangChain, framework adapters, benchmarks, or production deployment.

WaveMind Studio

WaveMind Studio is the built-in local dashboard. It runs on top of the same FastAPI app and SQLite database as the CLI:

wavemind studio

It opens http://127.0.0.1:8000/studio and gives you:

View	What it is for
Memory map	See field energy as a heatmap.
Namespace explorer	Inspect memories per user, project, agent, or tenant.
Live query tester	Test recall before wiring it into an app.
Feedback buttons	Mark recalled memories as useful or not useful.
Import/export	Import local files and export a namespace snapshot.
Backup	Create SQLite backups from the browser.
Conflict visualizer	Inspect correction groups when memories disagree.

To run Studio on a server without exposing it to the network, bind it to the loopback address explicitly:

wavemind --db ./state/wavemind.sqlite3 studio --host 127.0.0.1 --port 8000

Python Example

from wavemind import WaveMind

memory = WaveMind(db_path="./state/wavemind.sqlite3")

memory.remember(
    "The user prefers short practical answers.",
    namespace="user:42",
    tags=["preference"],
)

hits = memory.query("How should I answer this user?", namespace="user:42", top_k=3)
for hit in hits:
    print(hit.score, hit.text)

The integration pattern is intentionally small:

Call query() before your app, agent, tool, or UI needs context.
Pass the returned memories into your prompt, screen, search result, or decision function.
Call remember() after something worth keeping happens.
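
The second step, passing recalled memories into a prompt, can be sketched as a small formatting helper. This is an illustration only: `Hit` is a hypothetical stand-in for the objects returned by query() (the examples in this README only rely on their `score` and `text` attributes), and `format_memory_block` and its `max_chars` budget are assumptions, not part of WaveMind.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Hit:
    """Hypothetical stand-in for a query() result; only score and text are assumed."""
    score: float
    text: str


def format_memory_block(hits: Iterable[Hit], max_chars: int = 400) -> str:
    """Render hits as a bullet list, stopping before the character budget is exceeded."""
    lines = []
    used = 0
    for hit in hits:
        line = f"- {hit.text}"
        if used + len(line) + 1 > max_chars:  # +1 accounts for the newline
            break
        lines.append(line)
        used += len(line) + 1
    if not lines:
        return ""
    return "Relevant memory:\n" + "\n".join(lines)


hits = [
    Hit(0.60, "The user prefers short practical answers."),
    Hit(0.30, "The user works in UTC+3."),
]
block = format_memory_block(hits, max_chars=50)
print(block)
```

With a 50-character budget only the first memory fits, which keeps the prompt short when recall returns more than the model needs.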

HTTP Example

The FastAPI server is included in the base install:

wavemind --db ./state/wavemind.sqlite3 serve --host 127.0.0.1 --port 8000

Then use WaveMind from any language:

curl -X POST http://127.0.0.1:8000/remember \
  -H "Content-Type: application/json" \
  -d "{\"text\":\"Andrey prefers short answers\",\"namespace\":\"user:42\",\"tags\":[\"preference\"]}"

curl -X POST http://127.0.0.1:8000/query \
  -H "Content-Type: application/json" \
  -d "{\"query\":\"How should I answer?\",\"namespace\":\"user:42\",\"top_k\":3}"
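
The same two calls can be made from Python using only the standard library. This is a minimal sketch: the endpoint paths and JSON fields are the ones shown in the curl commands above, while `build_post` and `post_json` are hypothetical helper names. The shape of the response body is not documented here, so the sketch returns the decoded JSON as-is.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # address used by `wavemind serve` above


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the local WaveMind HTTP server."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 5.0):
    """Send the request and decode the JSON response body (needs a running server)."""
    with urllib.request.urlopen(build_post(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


remember_req = build_post(
    "/remember",
    {"text": "Andrey prefers short answers", "namespace": "user:42", "tags": ["preference"]},
)
query_req = build_post(
    "/query",
    {"query": "How should I answer?", "namespace": "user:42", "top_k": 3},
)
# With the server running, send them with:
#   post_json("/query", {"query": "How should I answer?", "namespace": "user:42", "top_k": 3})
```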

Where Data Lives

WaveMind is local-first. The SQLite database stores memories, vectors, metadata, namespaces, tags, TTL, hotness, priority, and audit events.

Runtime	Suggested database path
quick CLI experiment	`./wavemind.sqlite3`
Python app or agent	`./state/wavemind.sqlite3`
desktop app	user data directory, for example `%APPDATA%` or `~/.local/share`
server daemon	`/var/lib/wavemind/wavemind.sqlite3`
Docker	mounted volume, for example `/data/wavemind.sqlite3`

Explicit path:

wavemind --db ./state/app_memory.sqlite3 remember "Andrey prefers short answers" --namespace user:42
wavemind --db ./state/app_memory.sqlite3 query "answer style" --namespace user:42

Common Ways To Use It

You are building...	Start with...
Python app	`from wavemind import WaveMind`
LangChain agent	`WaveMindMemory` from `wavemind.integrations.langchain`
LangGraph workflow	`make_recall_node()` and `make_persist_node()`
LlamaIndex pipeline	`WaveMindRetriever`
CrewAI or AutoGen loop	The adapters in `wavemind.integrations`
Node, Go, Ruby, PHP, or no-code app	`wavemind serve` and the HTTP API
Personal knowledge base	Store notes by project namespace and query locally
Support or CRM workflow	Store customer issues, resolutions, preferences, and corrections
Research or trading notebook	Store observations with source metadata and TTL for temporary hypotheses

For migrations from existing local vector memory, start with docs/CHROMA_MIGRATION.md.

Minimal Agent Loop

from wavemind import WaveMind

memory = WaveMind(db_path="./state/agent.sqlite3")

def run_turn(user_id: str, user_text: str) -> str:
    namespace = f"user:{user_id}"
    hits = memory.query(user_text, namespace=namespace, top_k=5, min_score=0.25)
    recalled = "\n".join(f"- {hit.text}" for hit in hits)

    # call_your_llm is a placeholder for your own model client; it is not part of WaveMind.
    answer = call_your_llm(f"Relevant memory:\n{recalled}\n\nUser: {user_text}")

    memory.remember(f"User said: {user_text}", namespace=namespace, tags=["conversation"])
    memory.remember(f"Assistant answered: {answer}", namespace=namespace, tags=["conversation"])
    return answer

Terminal Demo

From a cloned repository:

$ python examples/demo.py
[ok] Remembered: "Andrey is a trader who tracks market breakouts."
[ok] Remembered: "Andrey prefers short practical answers about product decisions."

Query: "Andrey trader preferences"
-> Result 1 (0.60): "Andrey is a trader who tracks market breakouts."
-> Result 2 (0.30): "Andrey prefers short practical answers about product decisions."

The demo is offline, keyless, and uses the built-in hash encoder.

To see the behavior that plain vector search does not provide:

python examples/dynamic_memory_demo.py

That demo shows corrected facts outranking stale facts, temporary memory expiring, namespace isolation, and index-health reporting.

How The Memory Field Works

flowchart LR
    A["Text, event, note, document, or agent turn"] --> S["remember()"]
    S --> D[("SQLite: text + metadata + vectors + memory state")]
    Q["query()"] --> K["k-NN candidate search"]
    D --> K
    K --> W["wave-field re-rank"]
    W --> R["small ranked recall set"]
    R --> P["app, search UI, prompt, API, or tool"]
    P --> F["recall feedback updates hotness / priority"]
    F --> D

The wave field is the dynamic layer around stored memories. It is not a replacement for embeddings; it is the policy that decides which candidate memories should still matter.

Signal	Plain meaning	Effect
vector similarity	This text is semantically close to the query.	Gets into the candidate set.
hotness	This memory has been useful before.	Moves upward during recall.
decay	This memory has not mattered recently.	Slowly loses influence.
priority	The app says this fact is important.	Raises ranking even before repetition.
TTL	This fact is temporary.	Drops out after expiry.
namespace and tags	This belongs to one user/project/type.	Prevents cross-user or cross-topic leakage.
graph dynamics	Related memories can excite or inhibit each other.	Helps clusters and corrections behave like memory, not a flat list.

Technically, the current MemoryFieldGraph is a discrete graph over stored memories, not a continuous field in the mathematical-physics sense. The distinction matters: WaveMind is useful today as a dynamic memory engine, while the research path is to make the field dynamics more explicit, measurable, and scalable.
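
To make the signals in the table concrete, here is a toy re-ranker that combines similarity, hotness, decay, priority, and TTL. It is an illustration under stated assumptions, not WaveMind's scoring code: the formula, the `Candidate` fields, and the `half_life` and `hot_weight` parameters are all hypothetical, and graph dynamics are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    text: str
    similarity: float                    # cosine similarity from the k-NN stage
    hotness: float = 0.0                 # grows when past recalls were useful
    priority: float = 1.0                # set by the application
    last_used: float = 0.0               # unix time of the last useful recall
    expires_at: Optional[float] = None   # unix time; None means no TTL


def rerank(cands: List[Candidate], now: float,
           half_life: float = 86400.0,
           hot_weight: float = 0.5) -> List[Tuple[float, Candidate]]:
    """Toy policy: drop expired items, then boost similarity by decayed hotness."""
    ranked = []
    for c in cands:
        if c.expires_at is not None and c.expires_at <= now:
            continue  # TTL: expired memories drop out entirely
        age = max(0.0, now - c.last_used)
        decay = 0.5 ** (age / half_life)  # hotness halves every half_life seconds
        score = c.similarity * c.priority * (1.0 + hot_weight * c.hotness * decay)
        ranked.append((score, c))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


NOW = 1_000_000.0
ranked = rerank(
    [
        Candidate("stale", similarity=0.80),
        Candidate("fresh and reinforced", similarity=0.75, hotness=2.0, last_used=NOW),
        Candidate("expired", similarity=0.95, expires_at=NOW - 1.0),
    ],
    now=NOW,
)
order = [c.text for _, c in ranked]
print(order)
```

In this toy run the reinforced memory outranks a slightly closer but unused one, and the expired memory never reaches the result even though it has the highest similarity.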

Optional Embeddings

For sentence-transformer embeddings:

python -m pip install "wavemind[sentence]"
wavemind --encoder sentence remember "Andrey is a trader" --namespace demo
wavemind --encoder sentence query "What does Andrey do?" --namespace demo

Optional Index Backends

The default index is NumPy exact search. It is simple and reliable for local memory. For larger candidate generation, WaveMind also exposes optional index backends:

Index	Install	Notes
`numpy`	default	Exact cosine search, local, linear scan.
`quantized`	default	Local int8-compressed candidate index. Useful for memory-footprint experiments; current kernel is approximate and not yet faster than NumPy.
`annoy`	`pip install "wavemind[indexes]"`	Local ANN. Faster at larger N, but recall must be checked.
`faiss`	`pip install "wavemind[indexes]"`	FAISS flat inner-product path where `faiss-cpu` is available.
`faiss-persisted`	`pip install "wavemind[indexes]"`	FAISS with an explicit persisted index snapshot and id map.
`pgvector`	`pip install "wavemind[postgres]"`	PostgreSQL/pgvector candidate index. SQLite can still remain the local source of truth.
`qdrant`	`pip install "wavemind[indexes]"`	Qdrant service/local-mode candidate index. SQLite remains the source of truth; Qdrant stores vectors.

Persisted FAISS setup:

export WAVEMIND_FAISS_PATH="./state/wavemind.faiss"
wavemind --index faiss-persisted remember "Andrey is a trader" --namespace demo
wavemind --index faiss-persisted query "trader" --namespace demo

SQLite or Postgres remains the source of truth. The persisted FAISS files are a candidate-index snapshot and are validated against the current memory ids on load. If the snapshot does not match the stored memories, WaveMind rebuilds it. You can also check and rebuild the candidate index explicitly:

wavemind --index faiss-persisted index-health --json
wavemind --index faiss-persisted rebuild-index

Index health compares durable memory ids against the candidate index. Local indexes report exact missing/extra ids; service backends report exact ids when the backend exposes an id scan and otherwise fall back to count-based health.
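
The id comparison described above amounts to two set differences. The sketch below is an assumption about the general idea, not the actual `index-health` implementation; the function name and the report keys are hypothetical.

```python
from typing import Dict, Iterable


def index_health(store_ids: Iterable[str], index_ids: Iterable[str]) -> Dict[str, object]:
    """Compare durable memory ids against candidate-index ids."""
    store, index = set(store_ids), set(index_ids)
    missing = sorted(store - index)  # stored memories the index cannot find
    extra = sorted(index - store)    # index entries with no stored memory
    return {
        "ok": not missing and not extra,
        "missing": missing,
        "extra": extra,
        "store_count": len(store),
        "index_count": len(index),
    }


report = index_health(["a", "b", "c"], ["b", "c", "d"])
print(report)
```

A count-based fallback can only compare `store_count` with `index_count`, so it would report this example as healthy even though one id is missing and one is extra.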

pgvector setup:

export WAVEMIND_PGVECTOR_DSN="postgresql://user:password@localhost:5432/wavemind"
wavemind --index pgvector remember "Andrey is a trader" --namespace demo
wavemind --index pgvector query "trader" --namespace demo

Optional pgvector environment variables:

WAVEMIND_PGVECTOR_TABLE - table name, default wavemind_vectors.
WAVEMIND_PGVECTOR_COLLECTION - collection key, default default.
WAVEMIND_PGVECTOR_CREATE_HNSW=1 - create an HNSW index using vector_cosine_ops when the installed pgvector version supports it.

If WAVEMIND_PGVECTOR_DSN is missing, WaveMind raises a clear error instead of silently falling back to another index backend. The pgvector table is created with the current encoder dimension, so use a separate table when switching between different vector sizes.
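
An application that launches WaveMind may want to validate these variables itself before startup. The following is a hedged sketch of such a pre-flight check: the variable names and defaults come from the list above, while `PgvectorSettings` and `load_pgvector_settings` are hypothetical names and do not mirror WaveMind's internals.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PgvectorSettings:
    dsn: str
    table: str
    collection: str
    create_hnsw: bool


def load_pgvector_settings(env: Mapping[str, str] = os.environ) -> PgvectorSettings:
    """Read pgvector settings, failing loudly when the DSN is absent."""
    dsn = env.get("WAVEMIND_PGVECTOR_DSN", "").strip()
    if not dsn:
        # Fail fast rather than silently using another index backend.
        raise RuntimeError("WAVEMIND_PGVECTOR_DSN is required for the pgvector index")
    return PgvectorSettings(
        dsn=dsn,
        table=env.get("WAVEMIND_PGVECTOR_TABLE", "wavemind_vectors"),
        collection=env.get("WAVEMIND_PGVECTOR_COLLECTION", "default"),
        create_hnsw=env.get("WAVEMIND_PGVECTOR_CREATE_HNSW") == "1",
    )


settings = load_pgvector_settings(
    {"WAVEMIND_PGVECTOR_DSN": "postgresql://user:password@localhost:5432/wavemind"}
)
print(settings.table, settings.collection, settings.create_hnsw)
```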

Qdrant setup:

export WAVEMIND_QDRANT_URL="http://localhost:6333"
export WAVEMIND_QDRANT_COLLECTION="wavemind_vectors"
wavemind --index qdrant remember "Andrey is a trader" --namespace demo
wavemind --index qdrant query "trader" --namespace demo

For local experiments you can set WAVEMIND_QDRANT_URL=":memory:", but production latency and durability should be measured against a real Qdrant service. If WAVEMIND_QDRANT_URL is missing, WaveMind raises a clear error instead of silently falling back to another backend.

Storage Backends

SQLite is the default source of truth. For multi-tenant production deployments, WaveMind also exposes PostgreSQL as an explicit source-of-truth backend:

export WAVEMIND_STORE="postgres"
export WAVEMIND_POSTGRES_DSN="postgresql://user:password@localhost:5432/wavemind"
wavemind --store postgres remember "Andrey is a trader" --namespace user:andrey
wavemind --store postgres query "trader" --namespace user:andrey

Optional table environment variables:

WAVEMIND_POSTGRES_MEMORIES_TABLE, default wavemind_memories.
WAVEMIND_POSTGRES_AUDIT_TABLE, default wavemind_audit_events.

Postgres storage is separate from pgvector: Postgres storage keeps memories, metadata, TTL, audit events, and vectors as durable application state; pgvector is a candidate index backend for nearest-neighbor search. You can use SQLite storage with pgvector, Postgres storage with NumPy/FAISS/Qdrant, or eventually Postgres storage plus pgvector when you want both state and vector search inside PostgreSQL.

Backup And Restore

Exact one-file backup:

wavemind --db ./state/wavemind.sqlite3 backup --out ./backups/wavemind.sqlite3

Timestamped backups with retention:

wavemind --db ./state/wavemind.sqlite3 backup --out ./backups --prefix wavemind --keep-last 7

Restore into a new or replacement SQLite file:

wavemind restore --from ./backups/wavemind-20260630-120000.sqlite3 --to ./state/wavemind.sqlite3 --overwrite

The backup command uses SQLite's backup API, so it is safe to run while WaveMind is running and the database is in use. Restore is an explicit command and refuses to overwrite an existing database unless --overwrite is passed. For Postgres storage, use database-native backup tooling such as pg_dump, managed snapshots, or point-in-time recovery instead of WaveMind's SQLite file backup command.
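
For readers unfamiliar with SQLite's backup API, the sketch below shows the same mechanism through Python's standard `sqlite3` module. It is not WaveMind's backup command: `backup_sqlite` is a hypothetical helper, and the `notes` table exists only to give the demo something to copy.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_sqlite(src_path: str, dst_path: str) -> None:
    """Copy a SQLite database with the online backup API."""
    Path(dst_path).parent.mkdir(parents=True, exist_ok=True)
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # consistent snapshot, even if other connections are open
    finally:
        dst.close()
        src.close()


with tempfile.TemporaryDirectory() as tmp:
    src_file = str(Path(tmp) / "live.sqlite3")
    dst_file = str(Path(tmp) / "backups" / "copy.sqlite3")

    live = sqlite3.connect(src_file)
    live.execute("CREATE TABLE notes (body TEXT)")  # demo table, not WaveMind's schema
    live.execute("INSERT INTO notes VALUES ('hello')")
    live.commit()

    backup_sqlite(src_file, dst_file)  # the live connection is still open here

    copy = sqlite3.connect(dst_file)
    rows = copy.execute("SELECT body FROM notes").fetchall()
    copy.close()
    live.close()

print(rows)
```

Copying the file with an ordinary file copy while a writer is active can capture a half-written state; the backup API avoids that, which is why it suits a running service.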

HTTP API

Run the local FastAPI server:

wavemind --db ./app_memory.sqlite3 serve --host 127.0.0.1 --port 8000

Store and query memory over HTTP:

curl -X POST http://127.0.0.1:8000/remember -H "Content-Type: application/json" -d "{\"text\":\"Andrey is a trader\",\"namespace\":\"demo\"}"
curl -X POST http://127.0.0.1:8000/query -H "Content-Type: application/json" -d "{\"query\":\"trader\",\"namespace\":\"demo\",\"top_k\":1}"

Operational endpoints:

curl "http://127.0.0.1:8000/stats?namespace=demo"
curl "http://127.0.0.1:8000/audit?namespace=demo"
curl http://127.0.0.1:8000/metrics
curl http://127.0.0.1:8000/observability
curl http://127.0.0.1:8000/index/health
curl -X POST http://127.0.0.1:8000/index/rebuild
curl -X POST http://127.0.0.1:8000/backup -H "Content-Type: application/json" -d '{"path":"./backups","keep_last":7}'

/audit returns mutation events such as remember, forget, backup, and purge_expired. Query audit is opt-in with WAVEMIND_AUDIT_QUERIES=1 because writing an audit row for every query changes latency. /metrics returns a Prometheus-compatible text payload without adding a required dependency. /index/health reports source-of-truth versus candidate-index consistency. /index/rebuild rebuilds the candidate index from stored active memories and logs an index_rebuild audit event.

OpenTelemetry traces are optional and off by default:

pip install "wavemind[otel]"
export WAVEMIND_OTEL_ENABLED=1
export WAVEMIND_OTEL_SERVICE_NAME=wavemind-api
export WAVEMIND_OTEL_EXPORTER=otlp
export WAVEMIND_OTEL_ENDPOINT="http://localhost:4318/v1/traces"
wavemind --db ./app_memory.sqlite3 serve --host 127.0.0.1 --port 8000

Use WAVEMIND_OTEL_EXPORTER=console for local trace inspection. FastAPI requests are instrumented, and core memory phases such as encode, index search, graph propagation, reranking, load, and backup create spans when OpenTelemetry is enabled.

Production API controls are opt-in:

export WAVEMIND_READ_KEYS="read-key"
export WAVEMIND_WRITE_KEYS="write-key"
export WAVEMIND_ADMIN_KEYS="admin-key"
export WAVEMIND_RATE_LIMIT_PER_MINUTE=120

Role behavior:

Role	Env var	Allows
read	`WAVEMIND_READ_KEYS`	`/query`, `/stats`, `/metrics`, `/index/health`
write	`WAVEMIND_WRITE_KEYS`	read actions plus `/remember` and `/import`
admin	`WAVEMIND_ADMIN_KEYS` or `WAVEMIND_API_KEYS`	all actions, including `/audit`, `/backup`, `/index/rebuild`, and `/forget`

Keys are accepted through Authorization: Bearer <key> or X-API-Key: <key>. If no key env vars are set, authentication is disabled for local development.
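
A client can attach the key in either of the two header styles. The sketch below only builds the request with the standard library. The header names come from the paragraph above, `authed_request` is a hypothetical helper, and "read-key" is the example value from the export lines.

```python
import urllib.request


def authed_request(url: str, key: str, style: str = "bearer",
                   method: str = "GET") -> urllib.request.Request:
    """Attach an API key as a Bearer token or as an X-API-Key header."""
    if style == "bearer":
        headers = {"Authorization": f"Bearer {key}"}
    elif style == "x-api-key":
        headers = {"X-API-Key": key}
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return urllib.request.Request(url, headers=headers, method=method)


stats_url = "http://127.0.0.1:8000/stats?namespace=demo"
bearer_req = authed_request(stats_url, "read-key")
api_key_req = authed_request(stats_url, "read-key", style="x-api-key")
# With the server running: urllib.request.urlopen(bearer_req, timeout=5.0)
```

Per the role table, a read key should be enough for `/stats`, while `/remember` needs a write key and `/backup` or `/forget` an admin key.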

Install From Source

For contributors installing from a local clone:

git clone https://github.com/CaspianG/wavemind.git
cd wavemind
python -m pip install -e ".[sentence]"

One-file setup scripts are also included in the repository:

sh install.sh

install.bat

LangChain Memory

Install the optional integration:

pip install "wavemind[langchain]"

Use WaveMind as a drop-in LangChain memory object:

from wavemind.integrations.langchain import WaveMindMemory

memory = WaveMindMemory(db_path="agent_memory.sqlite3")
# Replace: memory = ConversationBufferMemory()

Offline runnable example from a cloned repository:

python examples/langchain_memory.py

Framework Integrations

WaveMind only needs two touch points in an agent, service, notebook, or app:

Before work happens, query() for relevant memory and pass the short result into the next step: a prompt, search screen, tool call, support workflow, or decision function.
After work happens, remember() durable facts, preferences, summaries, outcomes, corrections, or notes.

That makes it usable in more than LangChain:

Use case	Integration style
LangChain agent	Use `WaveMindMemory` from `wavemind.integrations.langchain`.
LangGraph workflow	Use `make_recall_node()` and `make_persist_node()` from `wavemind.integrations.langgraph`.
LlamaIndex pipeline	Use `WaveMindRetriever` from `wavemind.integrations.llamaindex`.
CrewAI crew	Use `WaveMindCrewAITools` from `wavemind.integrations.crewai`.
AutoGen loop	Use `WaveMindAutoGenMemory` from `wavemind.integrations.autogen`.
Custom Python agent	Create one `WaveMind` instance and call `query()` before the LLM.
Node, Go, Ruby, PHP, or no-code app	Run `wavemind serve` and call the HTTP API.
Multi-user SaaS	Use `namespace="user:<id>"` or `namespace="tenant:<id>:agent:<id>"`.
Knowledge base or notebook	Store notes by project namespace and retrieve a small evidence set.
Support or CRM workflow	Store issues, preferences, resolutions, and corrections with tags.
Research workflow	Store observations with source metadata and expire temporary hypotheses.
Temporary context	Store with `ttl_seconds=...` so stale memory expires automatically.
Preference/profile memory	Store with tags such as `profile`, `preference`, `project`, `decision`.
Corrections/privacy	Use `forget()` or namespace deletion workflows.

More examples: docs/USE_CASES.md. Migrating from a Chroma memory store: docs/CHROMA_MIGRATION.md.

Framework examples in this repository:

Framework / pattern	Example
LangChain memory	`examples/langchain_memory.py`
OpenAI/OpenRouter-style agent loop	`examples/agent_with_memory.py`
LangGraph hooks	`wavemind.integrations.langgraph`, `examples/framework_integrations.py`
LlamaIndex-style retriever	`wavemind.integrations.llamaindex`, `examples/framework_integrations.py`
CrewAI-style tools	`wavemind.integrations.crewai`, `examples/framework_integrations.py`
AutoGen-style hooks	`wavemind.integrations.autogen`, `examples/framework_integrations.py`
Namespace sharding	`examples/sharded_memory.py`

OpenClaw Integration

OpenClaw memory is file-centered: it writes durable memory into MEMORY.md, daily notes under memory/, and uses tools such as memory_search / memory_get. OpenClaw's documented agent loop also exposes hooks such as before_prompt_build, agent_end, message_received, and message_sent.

The safest WaveMind integration is a sidecar, not a replacement:

Keep OpenClaw's Markdown memory as the human-readable source of durable truth.
Use WaveMind as the dynamic recall layer for hotness, TTL, namespaces, and correction-sensitive ranking.
Store the SQLite file outside committed workspace files, for example ~/.openclaw/wavemind/<agent-id>.sqlite3.
Query WaveMind from before_prompt_build and inject a compact memory block with prependContext.
Capture new durable summaries from agent_end or message hooks.

Sketch of the adapter logic:

from pathlib import Path
from wavemind import WaveMind

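db_path = Path.home() / ".openclaw" / "wavemind" / "main.sqlite3"
db_path.parent.mkdir(parents=True, exist_ok=True)  # SQLite does not create missing parent directories
memory = WaveMind(db_path=str(db_path))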

def before_prompt_build(agent_id: str, user_text: str) -> str:
    namespace = f"openclaw:{agent_id}"
    hits = memory.query(user_text, namespace=namespace, top_k=5, min_score=0.25)
    return "\n".join(f"- {hit.text}" for hit in hits)

def agent_end(agent_id: str, summary: str) -> None:
    namespace = f"openclaw:{agent_id}"
    memory.remember(summary, namespace=namespace, tags=["summary"], priority=1.5)

For a production OpenClaw plugin, translate that sketch into the documented plugin hook surface: before_prompt_build for recall and agent_end / message_received / message_sent for capture.

Hermes and Custom Agent Loops

The public HERMES Agent is a LangChain / LangGraph mathematical-reasoning agent. Its README describes HermesReasoner as a LangChain BaseTool and mentions an optional in-memory embedding store for previously verified claims.

WaveMind fits there as a persistent memory layer around that loop:

Recall previously verified claims before HermesReasoner is invoked.
Store successfully verified claims with tags=["verified-claim"].
Scope by user_id, project, benchmark, or theorem namespace.
Replace short-lived in-memory vector recall when the agent needs restarts, TTL, explicit forgetting, or cross-session reuse.

Generic Hermes-style loop:

from wavemind import WaveMind

memory = WaveMind(db_path="./state/hermes_claims.sqlite3")

def verify_with_memory(user_id: str, problem: str) -> str:
    namespace = f"hermes:{user_id}"
    claims = memory.query(problem, namespace=namespace, tags=["verified-claim"], top_k=5)
    context = "\n".join(f"- {claim.text}" for claim in claims)

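    # call_hermes_reasoner is a placeholder for your own wrapper around HermesReasoner;
    # it is assumed to return an object with label, claim, and text attributes.
    result = call_hermes_reasoner(problem=problem, extra_context=context)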

    if result.label == "CORRECT":
        memory.remember(result.claim, namespace=namespace, tags=["verified-claim"], priority=2.0)
    return result.text

For any other agent framework, the rule is the same: recall before the model, capture after the turn, isolate users with namespaces, and use TTL for temporary facts.

Non-Agent Use Cases

WaveMind can store any small-to-medium memory stream where meaning, freshness, and repeated use matter. It is useful when "show me the nearest text" is not enough and the application needs "show me what is relevant now."

Use case	Example
Support memory	Recall past user issues, plans, bugs, and resolutions.
Product research	Store interview snippets with `tags=["customer", "pain"]`.
Team knowledge	Remember project decisions and suppress expired decisions with TTL.
Personal assistant	Store preferences, routines, people, and recurring context.
Game/NPC memory	Give characters scoped memory that strengthens after repeated events.
Trading research	Store labeled OHLCV pattern notes before building a backtest layer.
Document notebook	Import text/PDF/JSON chunks and query by namespace/project.
Personal knowledge base	Keep decisions, recurring context, people, links, and notes searchable without sending them to a hosted vector DB.

Why Dynamic Memory

WaveMind is not positioned as "a faster Chroma." Chroma, Qdrant, Pinecone, and Weaviate are vector databases: they store embeddings and return nearest neighbors. That is the right tool for many static RAG workloads.

WaveMind is a dynamic memory layer. It still uses vector search first, but then applies memory-specific signals that a plain vector store does not model by default:

Memory behavior	Why it matters	WaveMind mechanism
Hot memories	Information that keeps being useful should become easier to recall again.	Wave-field hotness and priority updates.
Aging memories	Old low-value facts should fade instead of competing forever.	TTL and decay-aware scoring.
Scoped memory	One user, app, workspace, or project should not leak into another.	Namespaces and tags.
Explicit forgetting	Real systems need deletion, privacy cleanup, and correction workflows.	`forget()` plus SQLite persistence.
Stable restart behavior	A memory system must survive process restarts.	SQLite source of truth, reloadable indexes.
Vector plus memory rank	Semantic similarity is necessary but not sufficient for long-running memory.	k-NN candidates first, wave field as re-ranker.

The current Chroma benchmark below is intentionally conservative: it compares static retrieval on the same facts and the same hash embeddings. That benchmark is useful, but it does not exercise WaveMind's main thesis: memory that changes over time as software recalls, reinforces, ages, and forgets information.

The benchmark that should decide whether WaveMind is worth using is a dynamic memory benchmark:

Scenario	What should happen
A fact, preference, or decision is used many times.	WaveMind should rank it higher than equally similar but unused facts.
A fact expires via TTL.	WaveMind should suppress it without requiring manual vector cleanup.
A user or system corrects an old fact.	WaveMind should prefer the newer or reinforced memory.
A query is ambiguous across namespaces.	WaveMind should return only the scoped user's memory.
A long history has many irrelevant facts.	WaveMind should preserve useful recall instead of treating all vectors equally.

In short: static vector search answers "what is nearest?" Dynamic memory also asks "what is still relevant, reinforced, scoped, and allowed to be remembered?"

Benchmark

WaveMind tracks benchmarks in two layers:

Implemented local checks - fast, reproducible scripts that run from this repository and protect the core memory behavior.
Public benchmark roadmap - external retrieval and memory benchmarks that should decide whether WaveMind is competitive outside hand-made demos.

Machine-readable benchmark matrix: benchmarks/benchmark_matrix_results.json. Full generated benchmark report: benchmarks/BENCHMARK_REPORT.md.

Visual summary generated from the checked-in JSON results:

[Figure: WaveMind benchmark summary chart]

Regenerate the matrix and chart locally:

python benchmarks/benchmark_registry.py --output benchmarks/benchmark_matrix_results.json
python benchmarks/render_benchmark_charts.py --output docs/assets/benchmark-summary.svg

The chart shows completed local measurements plus the public benchmark roadmap. Planned public benchmarks stay out of the results section until the dataset, engine, and result JSON are committed.

Status legend:

implemented - script and checked-in result exist.
runner ready - adapter exists, but the official public dataset result is not checked in yet.
planned - benchmark is part of the public proof path, but no WaveMind result is claimed.

How to read the benchmark classes:

Class	Popular examples	What it answers for WaveMind
Retrieval / embeddings	BEIR, MTEB Retrieval, MIRACL	Does WaveMind preserve normal vector-search quality on public qrels?
Vector index / database	ANN-Benchmarks, VectorDBBench	Is the candidate index fast enough at scale?
Agent memory	LoCoMo, LongMemEval, LongMemEval-V2, LMEB	Does WaveMind retrieve the right evolving memory across long histories?
RAG quality	RAGBench	Does dynamic memory improve final context and answer quality?

Current read:

Area	Result	Honest interpretation
Public agent-memory evidence	On official LoCoMo `locomo10.json`, WaveMind reaches `evidence_recall@5 0.386` with hash embeddings and `0.547` with sentence-transformers. Fair namespace-filtered Chroma reaches `0.257` / `0.407`; Qdrant reaches `0.263` / `0.409`.	WaveMind retrieves more labeled evidence. Chroma is still the fastest static vector-store baseline. Qdrant local payload filtering is much slower than service-mode Qdrant should be.
Public retrieval sanity check	On BEIR SciFact, WaveMind reaches `nDCG@10 0.354`, `Recall@10 0.482`; Qdrant matches that quality; Chroma reaches `0.350` / `0.467` with identical hash embeddings.	Same-embedding retrieval quality is close. Chroma is fastest at `1.79 ms`; Qdrant local is `17.71 ms`; WaveMind exact path is `117.02 ms`.
Static agent recall	WaveMind `precision@1` equals Chroma at `0.82`; WaveMind `precision@3` is `0.90` vs Chroma `0.88`.	Competitive quality, but Chroma is faster on the static vector-store path.
Dynamic memory policy	WaveMind reaches `1.00` stale suppression; Chroma static is `0.00`.	This is the strongest current differentiation: hotness, TTL, corrections, and namespaces.
Field memory dynamics	Graph-enabled WaveMind reaches `1.00` `precision@1`, `1.00` stale suppression, and `1.00` concept formation vs static WaveMind at `0.20` / `0.20` / `0.00`.	This is still synthetic, but it is the first regression check for memory-to-memory excitation, conflict inhibition, and decay.
Long-term evidence	WaveMind reaches `1.00` evidence recall@5, `1.00` precision@1, and `1.00` stale suppression on the synthetic long-memory evidence benchmark.	This is the first proof-shaped benchmark for agent memory: it measures whether stale/corrected/expired/cross-user facts stay out of retrieved evidence.
Capacity	Static `precision@1` is `0.94` at 5000 memories; dynamic policy keeps `1.00` on the current checks.	Quality is holding on these checks, but dynamic latency must be optimized.
LongMemEval full retrieval	On the official LongMemEval-S cleaned file, 470 non-abstention session-level questions, WaveMind reaches `evidence_recall@5 0.782` and `precision@1 0.696`; Chroma static reaches `0.518` / `0.355`; Qdrant static reaches `0.520` / `0.355`.	This is now the strongest public memory result in the repo. It is retrieval-only, not final answer quality.
ANN/index curve	At 50000 generated 128-d vectors, NumPy exact keeps `recall@10 1.000` at `6.49 ms`; quantized int8 keeps `0.934` at `24.92 ms`; Annoy is faster at `4.92 ms` but drops to `0.730` recall; Qdrant local keeps `1.000` recall at `43.49 ms`.	Current local scale boundary is clear: quantized search needs kernel work, Annoy needs tuning/FAISS, and Qdrant should be tested in service mode for a fair production comparison.
Next public proof	LongMemEval / LoCoMo answer generation with a local LLM.	Retrieval is now measured. The next serious number should test answer accuracy, abstention, and faithfulness.
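
For readers who want to sanity-check numbers like these, the retrieval metrics named in the table can be computed as below. This sketch uses the standard binary-relevance definitions; the repository's benchmark scripts may differ in detail (for example, in how `evidence_recall@5` aggregates across questions), and the memory ids are made up.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant ids that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def precision_at_1(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1.0 if the first result is relevant, else 0.0."""
    return 1.0 if ranked and ranked[0] in relevant else 0.0


def mrr_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Reciprocal rank of the first relevant id within the top k."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """nDCG with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal if ideal else 0.0


ranked_ids = ["m3", "m1", "m9", "m2"]  # hypothetical ids, best first
relevant_ids = {"m1", "m2"}
print(recall_at_k(ranked_ids, relevant_ids, 3))
print(precision_at_1(ranked_ids, relevant_ids))
print(mrr_at_k(ranked_ids, relevant_ids, 3))
print(round(ndcg_at_k(ranked_ids, relevant_ids, 3), 3))
```

Benchmark figures are normally the mean of these per-query values over all queries in the dataset.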

Real Benchmark Matrix

Benchmark	What it proves	Status	Baseline / competitor	Target
Agent user-memory retrieval	Natural-language recall over 200 user facts.	implemented	Chroma	Match Chroma `precision@1`, beat `precision@3`, stay under 5 ms at 200 memories.
Dynamic memory policy	Hot memory, TTL, corrections, stale suppression, namespace isolation.	implemented	Chroma static	Keep `precision@1` and stale suppression at 1.00, cut avg latency below 10 ms at 1000 memories.
Field memory graph dynamics	Related memories excite each other, newer conflicting memories suppress stale facts, graph energy decays, and active clusters expose concept candidates.	implemented	WaveMind static	Keep `precision@1`, stale suppression, and concept formation at 1.00 while moving from synthetic checks to LoCoMo/LongMemEval evidence.
WaveMind capacity curve	How recall and latency change at 200 / 1000 / 5000 memories.	implemented	WaveMind-only today	Keep `precision@1 >= 0.95` at 5000 memories and dynamic latency below 20 ms.
Long-term memory evidence	Evidence retrieval from long histories with profile, preference, correction, TTL, namespace, and filler noise.	implemented	Static vector / Chroma / Qdrant	Keep this as a small regression test while public LoCoMo and LongMemEval runners carry the stronger evidence claims.
BEIR-style open retrieval runner	Public `corpus.jsonl`, `queries.jsonl`, `qrels/*.tsv` datasets with the same metrics for each engine.	implemented	WaveMind / Chroma / Qdrant	Use identical embeddings and report `nDCG@k`, `Recall@k`, `MRR@k`, `precision@1`, and latency. Current checked-in run: BEIR SciFact.
ANN/VectorDBBench-style local curve	Recall/latency tradeoff for candidate indexes on generated vectors.	implemented	NumPy exact / quantized int8 / Annoy / Qdrant local	Use this as the local engineering curve; official VectorDBBench remains future work.
BEIR	Standard zero-shot information retrieval quality.	planned	Chroma / Qdrant / FAISS	Stay within 0.02 `nDCG@10` on identical embeddings.
MTEB Retrieval	Separates encoder quality from retrieval-store quality.	planned	Chroma / Qdrant / FAISS	Prove WaveMind does not reduce same-embedding retrieval quality.
MIRACL Russian	Multilingual retrieval with Russian relevance judgments.	planned	Chroma / Qdrant / FAISS	Reach same-embedding parity on Russian `nDCG@10`.
VectorDBBench	Vector database insertion/search/filter/cost-performance benchmark.	planned	Chroma / Qdrant / Milvus / Weaviate / Pinecone / FAISS	Use only after WaveMind has a production index path; today it is a memory layer, not a standalone cloud vector DB.
LoCoMo	Long conversation memory, temporal consistency, multi-hop recall. Retrieval-only runner is implemented for official `locomo10.json`.	implemented	Static vector / Chroma / Qdrant	Improve answer generation accuracy on top of the stronger sentence-transformers evidence retrieval run.
LongMemEval	Long-term assistant memory with updates and abstention.	implemented retrieval, answer runner ready	Static vector / Chroma / Qdrant / Mem0-style memory	Add LLM answer quality and abstention after retrieval.
LongMemEval-V2	Web-agent memory: state recall, dynamic state, workflow gotchas.	planned	AgentRunbook-R / Chroma RAG / Qdrant RAG	Prove WaveMind can retrieve compact evidence from agent trajectories.
LMEB	Long-horizon memory embedding tasks beyond normal passage retrieval.	planned	Embedding-only baselines / Chroma / Qdrant	Choose the default semantic encoder using memory-specific tasks.
RAGBench	Downstream RAG context and answer quality.	planned	Chroma RAG / Qdrant RAG / Pinecone RAG	Show whether stale-memory suppression improves context relevance.

The planned rows are not claimed wins. They are the public evaluation path WaveMind needs before strong production claims.

Open Retrieval Benchmarks

Many retrieval benchmarks use the same simple shape:

corpus.jsonl - documents with _id, optional title, and text.
queries.jsonl - queries with _id and text.
qrels/test.tsv - judged relevance rows: query-id, corpus-id, score.
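
As a rough illustration of that layout, the sketch below loads the three files into plain dictionaries. It is not the WaveMind runner's loader; the function names are hypothetical, and it assumes the qrels file starts with a `query-id`, `corpus-id`, `score` header row.

```python
import csv
import json
from pathlib import Path


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def load_beir_dataset(root, split="test"):
    """Load corpus, queries, and qrels from a BEIR-style directory (hypothetical helper)."""
    root = Path(root)
    # Join the optional title with the text so each document is one string.
    corpus = {
        row["_id"]: (row.get("title", "") + " " + row["text"]).strip()
        for row in load_jsonl(root / "corpus.jsonl")
    }
    queries = {row["_id"]: row["text"] for row in load_jsonl(root / "queries.jsonl")}

    qrels = {}
    with open(root / "qrels" / f"{split}.tsv", encoding="utf-8", newline="") as handle:
        # Assumes a header row naming the three columns.
        for row in csv.DictReader(handle, delimiter="\t"):
            qrels.setdefault(row["query-id"], {})[row["corpus-id"]] = int(row["score"])
    return corpus, queries, qrels
```

Keeping qrels as a nested dictionary (query id, then document id, then score) makes it easy to look up the judged documents for one query when scoring a ranked list.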

WaveMind includes a BEIR-style runner so the same downloaded dataset can be used for WaveMind, Chroma, and Qdrant:

pip install -e ".[bench]"
python benchmarks/open_retrieval_benchmark.py --dataset ./benchmarks/data/scifact --engines wavemind chroma qdrant --top-k 10

That runner reports nDCG@k, Recall@k, MRR@k, precision@1, average latency, and p95 latency. It intentionally uses the same WaveMind encoder for all engines, so the comparison is about retrieval/index behavior rather than which embedding model each project chooses by default.
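
For readers unfamiliar with these metrics, here is a minimal sketch of how Recall@k, MRR@k, and nDCG@k are commonly computed for one query. It uses the standard textbook definitions with linear gain; the function names are illustrative, and the checked-in runner may differ in details such as gain or tie handling.

```python
import math


def recall_at_k(ranked_ids, relevant, k):
    """Fraction of relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


def mrr_at_k(ranked_ids, relevant, k):
    """Reciprocal rank of the first relevant document in the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_ids, grades, k):
    """nDCG with linear gain; grades maps document id to a relevance score."""
    dcg = sum(
        grades.get(doc_id, 0) / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
    )
    # The ideal ranking places the highest-graded documents first.
    ideal = sorted(grades.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

The reported dataset-level number is then the mean of each per-query value. precision@1 is the special case that checks only whether the first result is relevant.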

Checked-in BEIR SciFact result:

5183 documents, 300 test queries, HashingTextEncoder, top-k 10. This is a public retrieval sanity check, not the main agent-memory proof. Full machine-readable result: benchmarks/open_retrieval_scifact_results.json.

engine	nDCG@10	Recall@10	MRR@10	precision@1	avg latency	p95 latency
WaveMind	0.354	0.482	0.317	0.240	117.02 ms	256.57 ms
Chroma	0.350	0.467	0.315	0.243	1.79 ms	2.39 ms
Qdrant	0.354	0.482	0.317	0.240	17.71 ms	23.28 ms

Read this result narrowly: WaveMind preserves same-embedding retrieval quality on a real public dataset, but its current exact path is far slower than Chroma. Qdrant local preserves the same ranking quality and is much faster than the WaveMind NumPy exact path. The engineering target is a FAISS/Annoy candidate index with WaveMind's dynamic field policy applied only as a top-k re-ranker.

LoCoMo Evidence Retrieval

WaveMind now includes a retrieval-only runner for the public LoCoMo dataset. It treats LoCoMo conversation turns as memories and LoCoMo QA evidence dialog IDs as relevance labels. This measures the memory layer before any LLM answer-generation noise.

Run it on the official locomo10.json file:

mkdir -p benchmarks/data
curl -L https://raw.githubusercontent.com/snap-research/locomo/main/data/locomo10.json -o benchmarks/data/locomo10.json
python benchmarks/locomo_memory_benchmark.py --dataset benchmarks/data/locomo10.json --engines wavemind static chroma qdrant --top-k 5 --output benchmarks/locomo_evidence_results.json

Metrics reported:

evidence_recall@k - whether the labeled LoCoMo evidence turns appear in the returned memory block.
precision@1 - whether the first returned memory is labeled evidence.
MRR@k - how high the first relevant evidence turn appears.
context_budget_saved - how much smaller the returned evidence block is than the full conversation memory.
avg_latency_ms and p95_latency_ms - retrieval latency only.
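
The sketch below shows one plausible way to compute these per-query values. It rests on assumptions not stated above: evidence recall is taken as the fraction of labeled turns returned, context size is measured in characters, and p95 uses the nearest-rank method. The function names are illustrative, not the runner's.

```python
def evidence_recall_at_k(returned_ids, evidence_ids, k):
    """Share of labeled evidence turns present in the top-k memory block."""
    evidence = set(evidence_ids)
    if not evidence:
        return 0.0
    return len(set(returned_ids[:k]) & evidence) / len(evidence)


def context_budget_saved(returned_texts, all_texts):
    """1 - (returned block size / full memory size), measured in characters."""
    full = sum(len(text) for text in all_texts)
    if full == 0:
        return 0.0
    return 1.0 - sum(len(text) for text in returned_texts) / full


def latency_summary(samples_ms):
    """Average and nearest-rank p95 over per-query latencies in milliseconds."""
    ordered = sorted(samples_ms)
    # Integer ceil(0.95 * n) avoids floating-point rounding at the boundary.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "avg_latency_ms": sum(ordered) / len(ordered),
        "p95_latency_ms": ordered[max(0, rank - 1)],
    }
```

Timing only the retrieval call, with embeddings computed beforehand, keeps the latency figures comparable across engines.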

If Chroma or Qdrant are not installed, use the baseline-only command:

python benchmarks/locomo_memory_benchmark.py --dataset benchmarks/data/locomo10.json --engines wavemind static --top-k 5

Namespace Sharding

For multi-tenant local deployments, ShardedWaveMind routes namespaces across multiple SQLite files:

from wavemind import ShardedWaveMind

memory = ShardedWaveMind(root_path="./state/wavemind-shards", shard_count=16)
memory.remember("Tenant A prefers short support replies.", namespace="tenant:a")
memory.remember("Tenant B tracks trading research.", namespace="tenant:b")

print(memory.query("support replies", namespace="tenant:a", top_k=3))
print(memory.stats())
memory.close()

This is namespace-level sharding for isolation and local scale. It is not a distributed HA cluster yet; the roadmap keeps replication, operator support, and managed service work separate.
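
One common way to route a namespace to a fixed number of shards is a stable hash, sketched below. This is an assumption for illustration only: the helper names and the file naming scheme are hypothetical, and ShardedWaveMind may route differently.

```python
import hashlib


def shard_for_namespace(namespace: str, shard_count: int) -> int:
    """Map a namespace to a shard index that is stable across processes."""
    # hashlib is used because Python's built-in hash() is salted per process.
    digest = hashlib.sha256(namespace.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shard_filename(namespace: str, shard_count: int) -> str:
    """Hypothetical naming scheme: one SQLite file per shard."""
    return f"shard-{shard_for_namespace(namespace, shard_count):04d}.sqlite3"


for tenant in ("tenant:a", "tenant:b"):
    print(tenant, "->", shard_filename(tenant, 16))
```

With modulo routing, changing the shard count remaps most namespaces, so the count is best fixed when the store is created.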

Checked-in official LoCoMo retrieval result (from the LoCoMo Evidence Retrieval runner described above):

10 conversations, 5882 memory turns, 1977 evidence-labeled questions, HashingTextEncoder, top-k 5. Full machine-readable result: benchmarks/locomo_evidence_results.json.

engine	evidence recall@5	precision@1	MRR@5	avg latency	p95 latency
WaveMind	0.386	0.239	0.307	3.95 ms	7.44 ms
Static vector	0.263	0.133	0.189	1.94 ms	3.87 ms
Chroma static	0.257	0.129	0.185	7.03 ms	9.74 ms
Qdrant static	0.263	0.133	0.189	147.58 ms	210.23 ms

Checked-in semantic LoCoMo run:

Same official data, same engines, but with sentence-transformers/paraphrase-multilingual-mpnet-base-v2. Full machine-readable result: benchmarks/locomo_sentence_evidence_results.json.

engine	evidence recall@5	precision@1	MRR@5	avg latency	p95 latency
WaveMind	0.547	0.333	0.432	3.44 ms	5.56 ms
Static vector	0.409	0.219	0.305	1.25 ms	2.05 ms
Chroma static	0.407	0.218	0.304	4.97 ms	6.30 ms
Qdrant static	0.409	0.219	0.305	124.34 ms	149.72 ms

Read this as retrieval-only evidence quality, not full QA quality. It uses the same embeddings for every engine inside each table. The sentence-transformers run is the stronger evidence-quality number: WaveMind improves recall over static vector-store retrieval, while the plain static vector baseline remains the fastest backend in both tables (WaveMind is faster than Chroma and Qdrant local on these runs). The next LoCoMo step is answer generation and faithfulness with a local LLM on top of retrieved evidence.

LongMemEval Evidence Retrieval

WaveMind also includes a retrieval-only runner for the official LongMemEval format. It indexes each question's long chat history and measures whether the expected evidence sessions or turns are retrieved before answer generation.

Run the full session-level retrieval benchmark:

python benchmarks/longmemeval_memory_benchmark.py --dataset benchmarks/data/longmemeval_s_cleaned.json --engines wavemind static chroma qdrant --granularity session --top-k 5 --output benchmarks/longmemeval_evidence_results.json

Checked-in official LongMemEval-S retrieval result:

470 non-abstention questions from longmemeval_s_cleaned.json, 22419 session memories, HashingTextEncoder, top-k 5. Full machine-readable result: benchmarks/longmemeval_evidence_results.json.

engine	evidence recall@5	precision@1	MRR@5	context saved	avg latency	p95 latency
WaveMind	0.782	0.696	0.762	0.869	7.27 ms	9.14 ms
Static vector	0.520	0.355	0.464	0.890	0.08 ms	0.10 ms
Chroma static	0.518	0.355	0.464	0.890	15.96 ms	18.68 ms
Qdrant static	0.520	0.355	0.464	0.890	398.48 ms	432.88 ms

The Chroma and Qdrant baselines now use the same namespace/payload scope as WaveMind. Qdrant is run in local embedded mode; the Qdrant client warns that local mode is not recommended above 20000 points, so this latency should not be read as a service-mode Qdrant result. The next step is answer-quality evaluation with a local LLM.

Answer-generation runner:

python benchmarks/longmemeval_answer_benchmark.py --dataset benchmarks/data/longmemeval_s_cleaned.json --provider ollama --model YOUR_LOCAL_MODEL --top-k 5 --output benchmarks/longmemeval_answer_results.json

There is also an extractive smoke run that does not require a model: benchmarks/longmemeval_answer_extractive_20_results.json. It is only a runner check, not a meaningful final answer-quality benchmark.

ANN Index Curve

WaveMind includes a local ANN/VectorDBBench-style curve for candidate indexes. It generates normalized vectors, queries with noisy copies, and measures recall@10 against exact cosine neighbors.

python benchmarks/ann_index_curve_benchmark.py --sizes 1000 5000 10000 50000 --dim 128 --queries 100 --top-k 10 --engines numpy quantized annoy faiss qdrant-local --output benchmarks/ann_index_curve_results.json
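
The measurement idea (normalized vectors, noisy query copies, recall@10 against exact neighbors) can be sketched in pure Python as follows. This is a small illustration, not the benchmark script: the stand-in "approximate" index simply scores on a prefix of the dimensions, and all names and sizes are made up for the example.

```python
import math
import random


def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def top_k_dot(query, vectors, k):
    """Top-k by dot product; on unit vectors this equals cosine similarity."""
    scores = [(sum(q * v for q, v in zip(query, vec)), i) for i, vec in enumerate(vectors)]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]


def recall_at_k(candidate_ids, exact_ids):
    """Share of the exact top-k that the candidate index also returned."""
    return len(set(candidate_ids) & set(exact_ids)) / len(exact_ids)


def make_dataset(n, dim, queries, noise, seed=0):
    rng = random.Random(seed)
    vectors = [normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(n)]
    # Each query is a stored vector plus Gaussian noise, re-normalized.
    picks = [rng.randrange(n) for _ in range(queries)]
    noisy = [normalize([x + rng.gauss(0, noise) for x in vectors[i]]) for i in picks]
    return vectors, noisy


def truncated_top_k(query, vectors, k, keep_dims):
    """Stand-in approximate index: score on the first keep_dims dimensions only."""
    return top_k_dot(query[:keep_dims], [v[:keep_dims] for v in vectors], k)


vectors, queries = make_dataset(n=300, dim=16, queries=10, noise=0.05)
recalls = []
for q in queries:
    exact = top_k_dot(q, vectors, 10)
    approx = truncated_top_k(q, vectors, 10, keep_dims=8)
    recalls.append(recall_at_k(approx, exact))
print(f"mean recall@10: {sum(recalls) / len(recalls):.3f}")
```

Recall here is always measured against the exact neighbor list, so an exact index scores 1.0 by construction and any approximation shows up as a value below that.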

Add pgvector to --engines when WAVEMIND_PGVECTOR_DSN points at a PostgreSQL database with pgvector enabled. Add qdrant-service when WAVEMIND_QDRANT_URL points at a running Qdrant service. Add faiss-persisted when WAVEMIND_FAISS_PATH points at the FAISS snapshot file to validate persisted-index startup behavior.

Production profile example:

export WAVEMIND_FAISS_PATH="./state/ann-curve.faiss"
export WAVEMIND_QDRANT_URL="http://localhost:6333"
export WAVEMIND_PGVECTOR_DSN="postgresql://user:password@localhost:5432/wavemind"
python benchmarks/ann_index_curve_benchmark.py --sizes 10000 50000 --dim 128 --queries 100 --top-k 10 --engines faiss-persisted qdrant-service pgvector --output benchmarks/production_index_profile_results.json
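
The runner marks an engine as skipped when a required package, service, or environment variable is missing. A minimal sketch of that kind of pre-flight check is shown below. The requirement table, the import names (for example `psycopg` for pgvector), and the function name are assumptions for illustration, not the runner's actual code.

```python
import importlib.util
import os

# Hypothetical requirement table: import name to probe and env var to check.
REQUIREMENTS = {
    "faiss-persisted": {"package": "faiss", "env": "WAVEMIND_FAISS_PATH"},
    "qdrant-service": {"package": "qdrant_client", "env": "WAVEMIND_QDRANT_URL"},
    "pgvector": {"package": "psycopg", "env": "WAVEMIND_PGVECTOR_DSN"},
}


def engine_status(name, environ=None):
    """Return a ready/skipped record instead of silently falling back."""
    environ = os.environ if environ is None else environ
    req = REQUIREMENTS.get(name)
    if req is None:
        return {"engine": name, "status": "skipped", "reason": "unknown engine"}
    if not environ.get(req["env"]):
        return {"engine": name, "status": "skipped", "reason": f"{req['env']} is not set"}
    if importlib.util.find_spec(req["package"]) is None:
        reason = f"package {req['package']} is not installed"
        return {"engine": name, "status": "skipped", "reason": reason}
    return {"engine": name, "status": "ready", "reason": ""}


for engine in ("faiss-persisted", "qdrant-service", "pgvector"):
    print(engine_status(engine))
```

Recording the reason alongside the skipped status keeps the result file honest about which engines were actually measured.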

Checked-in 50000-vector point:

engine	recall@10	avg latency	p95 latency	build
WaveMind numpy	1.000	6.49 ms	6.41 ms	744.7 ms
WaveMind quantized	0.934	24.92 ms	37.36 ms	2088.7 ms
WaveMind annoy	0.730	4.92 ms	7.37 ms	4090.1 ms
WaveMind faiss	skipped	-	-	-
Qdrant local	1.000	43.49 ms	59.68 ms	17525.7 ms

Read this as an engineering curve, not an official VectorDBBench result. Annoy is faster than exact NumPy at 50000 vectors but loses too much recall with the current settings. The new quantized backend compresses vectors and keeps 0.934 recall@10 on this run, but the current Python/NumPy kernel is slower than exact NumPy; it is a memory-footprint baseline, not a latency win yet. FAISS persistence, service-mode Qdrant, and pgvector are now explicit benchmark profiles. If a required package, service, or environment variable is missing, the runner marks that engine as skipped instead of silently falling back to another backend.
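
To make the footprint-versus-precision tradeoff concrete, here is a minimal sketch of symmetric per-vector int8 quantization. It is a generic textbook scheme, not necessarily what WaveMind's quantized backend does, and the function names are illustrative.

```python
def quantize_int8(vec):
    """Symmetric quantization: floats -> ints in [-127, 127] plus one scale."""
    peak = max(abs(x) for x in vec)
    scale = peak / 127.0 if peak > 0 else 1.0
    return [round(x / scale) for x in vec], scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


def approx_dot(codes_a, scale_a, codes_b, scale_b):
    """Integer dot product, rescaled once at the end."""
    return sum(a * b for a, b in zip(codes_a, codes_b)) * scale_a * scale_b


vec = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(vec)
restored = dequantize(codes, scale)
print(codes, max(abs(a - b) for a, b in zip(vec, restored)))
```

Stored as one byte per dimension, the codes take roughly a quarter of the space of float32 vectors. The rounding error per component is at most half a quantization step, which is the source of the recall loss that has to be measured per workload.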

Current Local Runs

Field memory dynamics benchmark:

13 memories, 5 conflicting-fact queries, deterministic local encoder. This benchmark isolates the MemoryFieldGraph: related memories can spread activation, newer conflicting memories inhibit stale facts, graph energy decays, and active clusters can surface concept candidates. Full machine-readable result: benchmarks/field_memory_dynamics_results.json.

engine	precision@1	precision@3	stale suppression	concept formation	decay ratio	avg latency
WaveMind graph	1.00	1.00	1.00	1.00	0.81	0.82 ms
WaveMind static	0.20	1.00	0.20	0.00	0.00	0.43 ms

Run locally from a cloned repository:

python benchmarks/field_memory_dynamics_benchmark.py
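
The three behaviors this benchmark isolates (excitation between related memories, inhibition of stale facts, and energy decay) can be pictured with a toy update rule. The sketch below is a deliberately simplified model with made-up memory names, edge weights, and decay factor; it is not the MemoryFieldGraph implementation.

```python
def step(activation, edges, decay=0.8):
    """One synchronous update: decay each node, then add signed neighbor input."""
    updated = {}
    for node, energy in activation.items():
        incoming = sum(
            weight * activation[src]
            for (src, dst), weight in edges.items()
            if dst == node
        )
        # Clamp so activation stays in [0, 1].
        updated[node] = min(1.0, max(0.0, decay * energy + incoming))
    return updated


edges = {
    ("likes_tea", "morning_routine"): 0.3,        # related memories excite each other
    ("morning_routine", "likes_tea"): 0.3,
    ("lives_in_berlin", "lives_in_paris"): -0.6,  # newer fact inhibits the stale one
}
activation = {
    "likes_tea": 1.0,
    "morning_routine": 0.0,
    "lives_in_berlin": 1.0,
    "lives_in_paris": 1.0,
}
after = step(activation, edges)
print(after)
```

After one step the stale memory drops well below the newer conflicting one, the related memory picks up activation it did not have, and every node without input loses energy through decay.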

Long-term memory evidence benchmark:

200 memories, 8 evidence queries, same HashingTextEncoder embeddings. This benchmark asks a stricter agent-memory question than static retrieval: did the system return the right evidence while suppressing stale, corrected, expired, or cross-user evidence? Full machine-readable result: benchmarks/long_memory_evidence_results.json.

engine	evidence recall@5	precision@1	stale suppression	context saved	avg latency
WaveMind	1.00	1.00	1.00	0.87	6.10 ms
Static vector	1.00	0.57	0.00	0.88	0.65 ms

Run locally from a cloned repository:

python benchmarks/long_memory_evidence_benchmark.py --dataset synthetic --engines wavemind static --memories 200 --top-k 5

To compare the same normalized benchmark with Chroma or Qdrant, install the benchmark extras and include those engines:

pip install -e ".[bench]"
python benchmarks/long_memory_evidence_benchmark.py --dataset synthetic --engines wavemind chroma qdrant --memories 200 --top-k 5

Russian sentence retrieval benchmark:

200 real Russian sentences from Tatoeba, 50 one-word queries, NumPy exact index.

metric	hash	sentence-transformers
precision@1	1.00	1.00
precision@3	1.00	1.00
avg query	0.49 ms	52.84 ms

Capacity check with the hash encoder:

memories	precision@1	precision@3	avg query
200	1.00	1.00	0.49 ms
1000	0.88	1.00	1.50 ms
5000	0.72	0.88	5.68 ms

Run locally from a cloned repository:

python benchmarks/ru_sentences_benchmark.py --sentences 200 --queries 50 --encoder hash --index numpy
python benchmarks/ru_sentences_benchmark.py --sentences 200 --queries 50 --encoder sentence --index numpy

Agent-memory benchmark against Chroma:

200 Russian user facts, 50 natural-language questions, same precomputed HashingTextEncoder embeddings for WaveMind and Chroma. Full machine-readable result: benchmarks/agent_memory_results.json.

This is a static retrieval benchmark. It measures baseline ranking and latency, not hotness, TTL, repeated recall, or memory aging.

engine	precision@1	precision@3	avg latency
WaveMind	0.82	0.90	2.25 ms
Chroma	0.82	0.88	0.93 ms

WaveMind-only capacity checks from the current ranking path:

scenario	memories	precision@1	precision@3	avg latency	p95 latency
static agent facts	200	0.96	0.98	4.05 ms	8.18 ms
static agent facts	1000	0.96	0.98	3.53 ms	5.20 ms
static agent facts	5000	0.94	0.98	13.71 ms	17.20 ms
dynamic memory policy	200	1.00	1.00	38.40 ms	41.14 ms
dynamic memory policy	1000	1.00	1.00	54.29 ms	72.38 ms
dynamic memory policy	5000	1.00	1.00	48.36 ms	86.13 ms

Machine-readable local capacity result: benchmarks/wavemind_capacity_results.json. These capacity checks are WaveMind-only because the local restricted environment did not have Chroma installed.

Run locally from a cloned repository:

pip install -e ".[bench]"
python benchmarks/agent_memory_benchmark.py --engines wavemind chroma --facts 200 --queries 50

Dynamic agent-memory benchmark:

200 memories, 8 checks, same precomputed HashingTextEncoder embeddings. This benchmark exercises hot memory, TTL, corrections, and namespace isolation. WaveMind applies its built-in memory policy. Chroma static is a plain vector-store baseline without application-layer TTL, delete handling, namespace filters, or recall reinforcement. Full machine-readable result: benchmarks/dynamic_memory_results.json.

engine	precision@1	precision@3	stale suppression	avg latency
WaveMind	1.00	1.00	1.00	25.26 ms
Chroma static	0.57	1.00	0.00	1.75 ms

Category success:

behavior	WaveMind	Chroma static
hot memory	1.00	0.50
TTL	1.00	0.00
correction	1.00	0.00
namespace isolation	1.00	0.00

Run locally from a cloned repository:

pip install -e ".[bench]"
python benchmarks/dynamic_memory_benchmark.py --engines wavemind chroma --memories 200
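
The behaviors exercised here (namespace isolation, TTL, corrections, hot memory) can be thought of as a policy applied to vector-search candidates. The sketch below shows that idea as an application-layer re-ranker; the `Memory` fields, the boost formula, and the function name are assumptions for illustration, not WaveMind's internal policy.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    id: str
    namespace: str
    similarity: float                    # score from the vector index
    expires_at: Optional[float] = None   # Unix time; None means no TTL
    superseded_by: Optional[str] = None  # id of a newer correcting memory
    recall_count: int = 0                # how often this memory was recalled


def apply_policy(candidates, namespace, now=None, hot_boost=0.05, top_k=3):
    """Filter and re-rank candidates; all weights here are illustrative."""
    now = time.time() if now is None else now
    kept = []
    for memory in candidates:
        if memory.namespace != namespace:
            continue  # namespace isolation
        if memory.expires_at is not None and memory.expires_at <= now:
            continue  # TTL expiry
        if memory.superseded_by is not None:
            continue  # corrected facts stay out of the results
        # Hot memories get a capped boost on top of vector similarity.
        score = memory.similarity + hot_boost * min(memory.recall_count, 5)
        kept.append((score, memory))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [memory for _, memory in kept[:top_k]]


candidates = [
    Memory("stale", "user:a", 0.90, superseded_by="new"),
    Memory("new", "user:a", 0.80),
    Memory("other-user", "user:b", 0.95),
    Memory("expired", "user:a", 0.85, expires_at=500.0),
    Memory("hot", "user:a", 0.70, recall_count=5),
]
print([m.id for m in apply_policy(candidates, "user:a", now=1000.0)])
```

A plain vector store returns the stale, expired, and cross-user candidates because they are close in embedding space; the policy layer is what removes them before ranking.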

Comparison

feature	WaveMind	Chroma	Qdrant
Primary role	Dynamic memory engine	Embedding database	Production vector database
Local SQLite persistence	Yes	Yes	No, separate service/storage
HTTP API	FastAPI included	Included	Included
Audit log / metrics	SQLite audit events plus `/metrics`	App-layer only	App-layer / service metrics
Dynamic memory priority	Wave-field hotness, TTL, priority	Metadata/filter driven	Payload/filter driven
Built-in forgetting	TTL and explicit forget	Manual delete/filtering	Manual delete/filtering
Best fit	Small to medium memory streams with dynamic recall	Local RAG apps and prototypes	Large-scale vector search
Scale target today	Up to 1000 optimal on NumPy, FAISS recommended beyond 5000	Larger than WaveMind local mode	Production scale

WaveMind is not trying to replace dedicated vector databases at scale. The intended product gap is dynamic priority: frequently used memories can become hotter while old or low-priority memories fade. For static RAG over large document collections, use a mature vector database. For memory that needs persistence, scoped recall, TTL, forgetting, and reinforcement, WaveMind is designed to sit above or beside the vector index.

If you already use Chroma for local memory, see the practical migration guide: docs/CHROMA_MIGRATION.md.

Known Limitations

Optimal capacity on the current NumPy exact index is up to 1000 records.
At 5000 records, one-word precision@1 is currently 0.72 with the hash encoder; many misses are ambiguous queries where another sentence containing the same word ranks first.
For N > 5000, the NumPy exact index is still reliable but scales linearly. Annoy is faster at 50000 vectors in the local curve, but current recall is only 0.730; the quantized backend reaches 0.934 recall@10 but is slower than NumPy on the current kernel. Use FAISS or a production vector service before claiming large-scale ANN quality.
sentence-transformers/paraphrase-multilingual-mpnet-base-v2 requires about 420 MB of model files. Benchmark runners cache embeddings so retrieval latency is measured separately from model encoding latency.
The Chroma comparison currently uses shared precomputed hash embeddings to isolate retrieval/ranking behavior; semantic model comparisons should be run separately.
The BEIR SciFact run uses the hash encoder to isolate index/retrieval behavior. It is not a semantic embedding leaderboard result.
On BEIR SciFact, WaveMind and Qdrant match on hash-encoder nDCG@10, while Chroma is much faster. The next index milestone is FAISS/Annoy candidate generation plus WaveMind top-k re-ranking.
The LoCoMo results are retrieval-only evidence results, not final answer-quality scores. The sentence-transformers run is stronger than the hash run, but still needs answer generation and faithfulness checks.
In the 200-fact agent benchmark, Chroma is faster on average while WaveMind is slightly higher at precision@3.
The dynamic benchmark currently compares WaveMind against a static Chroma baseline. Chroma and Qdrant can implement similar behavior with extra application-layer metadata policy, deletes, filters, and reinforcement logic.
MemoryFieldGraph is a discrete graph over stored memories, not a continuous mathematical field. Its current build path should be optimized with incremental edge updates before large production use.
pgvector is a candidate-index backend. PostgreSQL source-of-truth storage is also available separately, but migrations, PITR docs, and service benchmark profiles still need more real deployment coverage.
The Qdrant backend is also a candidate-index backend. WaveMind rebuilds it from SQLite on load/build, so large service-mode deployments still need a measured rebuild strategy and index-health monitoring.
The persisted FAISS backend validates a snapshot against current memory ids and avoids unnecessary FAISS rebuilds when the snapshot matches. It is still a single-node flat-index path, not distributed HA.
The quantized backend is an explicit int8 candidate-index experiment. It reduces vector precision and must be benchmarked per workload before use.
The synthetic long-term memory evidence benchmark is useful for regression and product-shape proof, but public claims should lean on LoCoMo and LongMemEval instead.
The LongMemEval result is retrieval-only. It is not a full LongMemEval answer-generation leaderboard-equivalent score.
Qdrant baselines in this README use embedded local mode. Qdrant itself warns that local mode is not recommended above 20000 points; use the qdrant-service benchmark profile before making production latency claims.
MTEB, MIRACL, LMEB, official VectorDBBench, and RAGBench are listed as the public benchmark roadmap, not as completed results yet.
Ollama answer generation is implemented, but the current machine has no local Ollama model available and the local Ollama API returns 502/connection-reset. The checked-in answer file is extractive smoke only, not an LLM score.
Public benchmark adapters require optional datasets, heavier dependencies, or running services. They are intentionally outside the minimal pip install wavemind path.
Dynamic memory is slower than static Chroma in the current local benchmark: 25.26 ms vs 1.75 ms average query latency on this machine.
Current WaveMind-only dynamic checks keep precision@1 at 1.00 through 5000 memories, but average latency is around 38-54 ms (48-54 ms at 1000 and 5000 memories). The next optimization target is field/re-ranking latency, not basic recall quality.

Roadmap

Full roadmap: docs/ROADMAP.md. Launch and positioning kit: docs/LAUNCH_KIT.md.

Near-term priorities:

Service-mode Qdrant, pgvector, and persisted-FAISS benchmark runs on a real production-like machine.
Migration tooling and operational docs for Postgres source-of-truth storage.
Tune the new quantized int8 backend so compression does not cost more latency than exact NumPy on common workloads.
Service-mode Qdrant and FAISS latency baselines using the explicit Qdrant backend, not only the standalone Qdrant benchmark baseline.
LoCoMo and LongMemEval answer-quality evaluation, not retrieval only.
Harden framework adapters: LangGraph, LlamaIndex, CrewAI, AutoGen, OpenClaw, and HTTP-only sidecar use.
Faster dynamic re-ranking through smaller candidate windows, caching, and background updates.
Better production operations: OpenTelemetry is optional and implemented; richer latency histograms, index-health metrics, alerting examples, and restore drills are next.

Longer-term direction:

scale from thousands of memories to 100k-1M on one node;
keep SQLite as the local source of truth while adding Postgres and external vector backends for production;
evolve MemoryFieldGraph from a regression-tested graph into a stronger field-memory model with excitation, inhibition, decay, and consolidation;
build enterprise features only after benchmarked retrieval, latency, and answer-quality evidence are solid.

Contributing

Contributing guide: CONTRIBUTING.md.

Useful contribution paths:

add reproducible benchmark adapters and checked-in result JSON;
improve FAISS, Qdrant, pgvector, or other candidate-index backends;
add examples for LangGraph, LlamaIndex, CrewAI, AutoGen, OpenClaw, and HTTP-only sidecar deployments;
improve dynamic memory behavior around TTL, corrections, namespaces, graph excitation/inhibition, and consolidation;
harden production operations: backups, audit logs, metrics, tracing, and migration tools.

GitHub issue templates are included for bugs, features, benchmarks, and integrations. Benchmark claims need a reproduction command and committed result artifact before they are added to README.

License

MIT. See LICENSE.

Release history

This version

2.2.1

Jul 1, 2026

2.2.0

Jul 1, 2026

2.1.1

Jul 1, 2026

2.1.0

Jun 30, 2026

2.0.5

Jun 18, 2026

2.0.4

Jun 18, 2026

2.0.3

Jun 18, 2026

2.0.2

Jun 18, 2026

2.0.1

Jun 18, 2026

2.0.0

Jun 18, 2026

Download files

Source Distribution

wavemind-2.2.1.tar.gz (217.3 kB)

Uploaded Jul 1, 2026 Source

Built Distribution

wavemind-2.2.1-py3-none-any.whl (71.9 kB)

Uploaded Jul 1, 2026 Python 3

File details

Details for the file wavemind-2.2.1.tar.gz.

File metadata

Download URL: wavemind-2.2.1.tar.gz
Upload date: Jul 1, 2026
Size: 217.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wavemind-2.2.1.tar.gz
Algorithm	Hash digest
SHA256	`e3adfedb8ff4a9ea9053d34f022e182bed705585ee2811b01c0e5b67dc616f1c`
MD5	`1bc9f6413de539dec96126d0802e1683`
BLAKE2b-256	`bbec73cb0c5815f0caeafb4bf8c0b336979004ef71fdb1b73f31225afe892ce9`

Provenance

The following attestation bundles were made for wavemind-2.2.1.tar.gz:

Publisher: publish.yml on CaspianG/wavemind

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wavemind-2.2.1.tar.gz
- Subject digest: e3adfedb8ff4a9ea9053d34f022e182bed705585ee2811b01c0e5b67dc616f1c
- Sigstore transparency entry: 2038249989
- Sigstore integration time: Jul 1, 2026
Source repository:
- Permalink: CaspianG/wavemind@2f2f0dec06f36acaae9c31a14d4863a64b470737
- Branch / Tag: refs/tags/v2.2.1
- Owner: https://github.com/CaspianG
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2f2f0dec06f36acaae9c31a14d4863a64b470737
- Trigger Event: push

File details

Details for the file wavemind-2.2.1-py3-none-any.whl.

File metadata

Download URL: wavemind-2.2.1-py3-none-any.whl
Upload date: Jul 1, 2026
Size: 71.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for wavemind-2.2.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1bd5d7f3ae6b69706838c0e89ca99b86ecb9d54f04f79eb3206195305bbdd41d`
MD5	`d7dcaddfce7088e68064841dca79657a`
BLAKE2b-256	`2f437f75c306f01ea058332aa485eaad1f4a4f8e50d99c12326d0d26121df743`

Provenance

The following attestation bundles were made for wavemind-2.2.1-py3-none-any.whl:

Publisher: publish.yml on CaspianG/wavemind

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wavemind-2.2.1-py3-none-any.whl
- Subject digest: 1bd5d7f3ae6b69706838c0e89ca99b86ecb9d54f04f79eb3206195305bbdd41d
- Sigstore transparency entry: 2038250115
- Sigstore integration time: Jul 1, 2026
Source repository:
- Permalink: CaspianG/wavemind@2f2f0dec06f36acaae9c31a14d4863a64b470737
- Branch / Tag: refs/tags/v2.2.1
- Owner: https://github.com/CaspianG
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2f2f0dec06f36acaae9c31a14d4863a64b470737
- Trigger Event: push
