Local-first dynamic memory field with vector search and wave-field re-ranking

WaveMind

Local-first dynamic memory for apps, agents, notebooks, and tools.

WaveMind stores memories in SQLite, finds relevant candidates with vector search, then uses a wave-field priority layer to decide what still matters: hot facts rise, stale facts fade, temporary facts expire, and namespaces keep users or projects isolated.

Quick Start | CLI | Studio | Python Example | HTTP Example | Where Data Lives | LangChain | Chroma Migration | Use Cases | HTTP API | Benchmarks | Roadmap | Contributing | Limitations

What Is WaveMind?

WaveMind is a dynamic memory engine you can embed in a product.

Use it when your app needs to remember things like user preferences, decisions, corrections, notes, research snippets, support history, agent context, or temporary facts.

The short version:

normal vector search:  find the nearest text
WaveMind:              find the nearest useful memory

WaveMind is not trying to replace every vector database. It is the memory layer around retrieval: persistence, namespaces, TTL, hotness, priority, decay, explicit forgetting, audit events, and optional graph dynamics.

60-Second Version

| Question | Answer |
| --- | --- |
| What does it store? | Text memories, vectors, metadata, tags, TTL, priority, and recall state. |
| Where does it store data? | A local SQLite file by default; Postgres is available for production state. |
| How do I use it? | CLI, Python API, FastAPI HTTP server, LangChain memory, or framework adapters. |
| What is different from Chroma/Qdrant? | WaveMind adds memory policy: hotness, decay, TTL, correction handling, and scoped recall. |
| When should I not use it? | For huge static document search where a mature vector DB is already the right tool. |
| What is the simplest install? | python -m pip install wavemind |

Why Use It?

| If you need... | WaveMind gives you... |
| --- | --- |
| Memory that survives restarts | One SQLite file stores text, vectors, metadata, TTL, and recall state. |
| Per-user or per-project recall | Namespaces and tags keep memories separated. |
| Temporary facts | ttl_seconds lets facts expire automatically. |
| Corrections and changing preferences | Newer or reinforced memories can outrank stale ones. |
| A simple integration path | Python API, CLI, FastAPI server, and LangChain memory class. |
| Production hygiene | Backups, audit log, API keys, rate limits, Prometheus metrics, and OpenTelemetry traces. |

Quick Start

The shortest path from install to first recall:

python -m pip install wavemind
wavemind remember "Andrey is a trader" --namespace demo
wavemind query "What does Andrey do?" --namespace demo

Need a reminder after install?

wavemind quickstart

Want to see and manage memory in a browser?

wavemind studio

By default, WaveMind creates wavemind.sqlite3 in the current working directory. That file is the local source of truth. Keep it out of git and back it up like application state.

CLI Cheat Sheet

Start here if you only want to use WaveMind from the terminal:

| Goal | Command |
| --- | --- |
| Show first-run help | wavemind quickstart |
| Store a memory | wavemind remember "Andrey prefers short answers" --namespace user:42 |
| Search memory | wavemind query "answer style" --namespace user:42 |
| Open local dashboard | wavemind studio |
| See stored state | wavemind stats --namespace user:42 |
| Delete a namespace | wavemind forget --namespace user:42 |
| Import notes | wavemind import ./notes.txt --namespace project:alpha |
| Use another database file | wavemind --db ./state/memory.sqlite3 query "budget" --namespace user:42 |
| Start the HTTP API | wavemind --db ./state/memory.sqlite3 serve --host 127.0.0.1 --port 8000 |

After this point, choose the integration path you need: Python, HTTP, LangChain, framework adapters, benchmarks, or production deployment.

WaveMind Studio

WaveMind Studio is the built-in local dashboard. It runs on top of the same FastAPI app and SQLite database as the CLI:

wavemind studio

It opens http://127.0.0.1:8000/studio and gives you:

| View | What it is for |
| --- | --- |
| Memory map | See field energy as a heatmap. |
| Namespace explorer | Inspect memories per user, project, agent, or tenant. |
| Live query tester | Test recall before wiring it into an app. |
| Feedback buttons | Mark recalled memories as useful or not useful. |
| Import/export | Import local files and export a namespace snapshot. |
| Backup | Create SQLite backups from the browser. |
| Conflict visualizer | Inspect correction groups when memories disagree. |

To run Studio on a server while keeping it bound to the loopback interface, pass the database path, host, and port explicitly:

wavemind --db ./state/wavemind.sqlite3 studio --host 127.0.0.1 --port 8000

Python Example

from wavemind import WaveMind

memory = WaveMind(db_path="./state/wavemind.sqlite3")

memory.remember(
    "The user prefers short practical answers.",
    namespace="user:42",
    tags=["preference"],
)

hits = memory.query("How should I answer this user?", namespace="user:42", top_k=3)
for hit in hits:
    print(hit.score, hit.text)

The integration pattern is intentionally small:

  1. Call query() before your app, agent, tool, or UI needs context.
  2. Pass the returned memories into your prompt, screen, search result, or decision function.
  3. Call remember() after something worth keeping happens.

HTTP Example

The FastAPI server is included in the base install:

wavemind --db ./state/wavemind.sqlite3 serve --host 127.0.0.1 --port 8000

Then use WaveMind from any language:

curl -X POST http://127.0.0.1:8000/remember \
  -H "Content-Type: application/json" \
  -d "{\"text\":\"Andrey prefers short answers\",\"namespace\":\"user:42\",\"tags\":[\"preference\"]}"

curl -X POST http://127.0.0.1:8000/query \
  -H "Content-Type: application/json" \
  -d "{\"query\":\"How should I answer?\",\"namespace\":\"user:42\",\"top_k\":3}"
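
The same two calls can be made from Python with only the standard library. This is a minimal sketch, not an official client: the endpoint paths and JSON fields are copied from the curl commands above, while `BASE_URL`, `build_request`, and `post_json` are hypothetical helper names. The response shape is not documented in this section, so the sketch returns the decoded JSON as-is.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://127.0.0.1:8000"  # matches the serve command above


def build_request(path: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request for a WaveMind endpoint without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: Dict[str, Any]) -> Any:
    """Send the request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payloads as the curl examples; nothing is sent until post_json() is called.
remember_req = build_request(
    "/remember",
    {"text": "Andrey prefers short answers", "namespace": "user:42", "tags": ["preference"]},
)
query_req = build_request(
    "/query",
    {"query": "How should I answer?", "namespace": "user:42", "top_k": 3},
)
```

With the server running, `post_json("/query", {...})` sends the request. Building the request separately from sending it keeps the payload easy to inspect in tests.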

Where Data Lives

WaveMind is local-first. The SQLite database stores memories, vectors, metadata, namespaces, tags, TTL, hotness, priority, and audit events.

| Runtime | Suggested database path |
| --- | --- |
| Quick CLI experiment | ./wavemind.sqlite3 |
| Python app or agent | ./state/wavemind.sqlite3 |
| Desktop app | User data directory, for example %APPDATA% or ~/.local/share |
| Server daemon | /var/lib/wavemind/wavemind.sqlite3 |
| Docker | Mounted volume, for example /data/wavemind.sqlite3 |

Explicit path:

wavemind --db ./state/app_memory.sqlite3 remember "Andrey prefers short answers" --namespace user:42
wavemind --db ./state/app_memory.sqlite3 query "answer style" --namespace user:42

Common Ways To Use It

| You are building... | Start with... |
| --- | --- |
| Python app | from wavemind import WaveMind |
| LangChain agent | WaveMindMemory from wavemind.integrations.langchain |
| LangGraph workflow | make_recall_node() and make_persist_node() |
| LlamaIndex pipeline | WaveMindRetriever |
| CrewAI or AutoGen loop | The adapters in wavemind.integrations |
| Node, Go, Ruby, PHP, or no-code app | wavemind serve and the HTTP API |
| Personal knowledge base | Store notes by project namespace and query locally |
| Support or CRM workflow | Store customer issues, resolutions, preferences, and corrections |
| Research or trading notebook | Store observations with source metadata and TTL for temporary hypotheses |

For migrations from existing local vector memory, start with docs/CHROMA_MIGRATION.md.

Minimal Agent Loop

from wavemind import WaveMind

memory = WaveMind(db_path="./state/agent.sqlite3")

def run_turn(user_id: str, user_text: str) -> str:
    # One namespace per user keeps recall isolated between users.
    namespace = f"user:{user_id}"
    # Recall before the model call; min_score drops weak matches.
    hits = memory.query(user_text, namespace=namespace, top_k=5, min_score=0.25)
    recalled = "\n".join(f"- {hit.text}" for hit in hits)

    # call_your_llm is a placeholder: replace it with your own model client.
    answer = call_your_llm(f"Relevant memory:\n{recalled}\n\nUser: {user_text}")

    # Persist both sides of the turn after the model call.
    memory.remember(f"User said: {user_text}", namespace=namespace, tags=["conversation"])
    memory.remember(f"Assistant answered: {answer}", namespace=namespace, tags=["conversation"])
    return answer

Terminal Demo

From a cloned repository:

$ python examples/demo.py
[ok] Remembered: "Andrey is a trader who tracks market breakouts."
[ok] Remembered: "Andrey prefers short practical answers about product decisions."

Query: "Andrey trader preferences"
-> Result 1 (0.60): "Andrey is a trader who tracks market breakouts."
-> Result 2 (0.30): "Andrey prefers short practical answers about product decisions."

The demo is offline, keyless, and uses the built-in hash encoder.

To see the behavior that plain vector search does not provide:

python examples/dynamic_memory_demo.py

That demo shows corrected facts outranking stale facts, temporary memory expiring, namespace isolation, and index-health reporting.

How The Memory Field Works

flowchart LR
    A["Text, event, note, document, or agent turn"] --> S["remember()"]
    S --> D[("SQLite: text + metadata + vectors + memory state")]
    Q["query()"] --> K["k-NN candidate search"]
    D --> K
    K --> W["wave-field re-rank"]
    W --> R["small ranked recall set"]
    R --> P["app, search UI, prompt, API, or tool"]
    P --> F["recall feedback updates hotness / priority"]
    F --> D

The wave field is the dynamic layer around stored memories. It is not a replacement for embeddings; it is the policy that decides which candidate memories should still matter.

| Signal | Plain meaning | Effect |
| --- | --- | --- |
| Vector similarity | This text is semantically close to the query. | Gets into the candidate set. |
| Hotness | This memory has been useful before. | Moves upward during recall. |
| Decay | This memory has not mattered recently. | Slowly loses influence. |
| Priority | The app says this fact is important. | Raises ranking even before repetition. |
| TTL | This fact is temporary. | Drops out after expiry. |
| Namespace and tags | This belongs to one user/project/type. | Prevents cross-user or cross-topic leakage. |
| Graph dynamics | Related memories can excite or inhibit each other. | Helps clusters and corrections behave like memory, not a flat list. |

Technically, the current MemoryFieldGraph is a discrete graph over stored memories, not a continuous mathematical physics field. That honesty matters: WaveMind is useful today as a dynamic memory engine, while the research path is to make the field dynamics more explicit, measurable, and scalable.
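
To make the signal table concrete, here is a toy re-ranker in plain Python. It is an illustration only and not WaveMind's actual scoring code: the `Candidate` fields, the multiplicative formula, and the one-day half-life are all assumptions chosen to show how similarity, hotness, decay, priority, and TTL can combine after the k-NN stage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    text: str
    similarity: float                   # cosine similarity from the k-NN stage
    hotness: float = 0.0                # grows when recall feedback marks the memory useful
    priority: float = 1.0               # set by the application when the memory is stored
    last_used: float = 0.0              # unix timestamp of the last useful recall
    expires_at: Optional[float] = None  # unix timestamp; None means no TTL


def rerank(candidates: List[Candidate], now: float, half_life_s: float = 86_400.0) -> List[Candidate]:
    """Drop expired candidates, then sort by a similarity-times-policy score."""
    scored = []
    for c in candidates:
        if c.expires_at is not None and c.expires_at <= now:
            continue  # TTL: expired facts never reach the recall set
        age = max(0.0, now - c.last_used)
        decay = 0.5 ** (age / half_life_s)  # hotness halves every half_life_s seconds
        score = c.similarity * c.priority * (1.0 + c.hotness * decay)
        scored.append((score, c))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored]


now = 1_000_000.0
stale = Candidate("Andrey lives in Berlin", similarity=0.62, last_used=now - 30 * 86_400)
fresh = Candidate("Andrey moved to Lisbon", similarity=0.58, hotness=1.0, last_used=now - 3_600)
temp = Candidate("Andrey is travelling this week", similarity=0.70, expires_at=now - 1)

ranked = rerank([stale, fresh, temp], now=now)
```

In this example the recently reinforced memory outranks the slightly more similar stale one, and the expired memory is dropped even though it has the highest raw similarity.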

Optional Embeddings

For sentence-transformer embeddings:

python -m pip install "wavemind[sentence]"
wavemind --encoder sentence remember "Andrey is a trader" --namespace demo
wavemind --encoder sentence query "What does Andrey do?" --namespace demo

Optional Index Backends

The default index is NumPy exact search. It is simple and reliable for local memory. For larger candidate generation, WaveMind also exposes optional index backends:

| Index | Install | Notes |
| --- | --- | --- |
| numpy | default | Exact cosine search, local, linear scan. |
| quantized | default | Local int8-compressed candidate index. Useful for memory-footprint experiments; current kernel is approximate and not yet faster than NumPy. |
| annoy | pip install "wavemind[indexes]" | Local ANN. Faster at larger N, but recall must be checked. |
| faiss | pip install "wavemind[indexes]" | FAISS flat inner-product path where faiss-cpu is available. |
| faiss-persisted | pip install "wavemind[indexes]" | FAISS with an explicit persisted index snapshot and id map. |
| pgvector | pip install "wavemind[postgres]" | PostgreSQL/pgvector candidate index. SQLite can still remain the local source of truth. |
| qdrant | pip install "wavemind[indexes]" | Qdrant service/local-mode candidate index. SQLite remains the source of truth; Qdrant stores vectors. |

Persisted FAISS setup:

export WAVEMIND_FAISS_PATH="./state/wavemind.faiss"
wavemind --index faiss-persisted remember "Andrey is a trader" --namespace demo
wavemind --index faiss-persisted query "trader" --namespace demo

SQLite or Postgres remains the source of truth. The persisted FAISS files are a candidate-index snapshot and are validated against the current memory ids on load. If the snapshot does not match the stored memories, WaveMind rebuilds it. You can also check and rebuild the candidate index explicitly:

wavemind --index faiss-persisted index-health --json
wavemind --index faiss-persisted rebuild-index

Index health compares durable memory ids against the candidate index. Local indexes report exact missing/extra ids; service backends report exact ids when the backend exposes an id scan and otherwise fall back to count-based health.
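
The exact-id comparison described above amounts to two set differences. The sketch below shows the idea under stated assumptions: `IndexHealth` and `check_index` are hypothetical names, and the real command may report more fields than this.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class IndexHealth:
    missing: Set[str]  # ids in the durable store but absent from the candidate index
    extra: Set[str]    # ids in the candidate index that the store no longer has

    @property
    def healthy(self) -> bool:
        return not self.missing and not self.extra


def check_index(store_ids: Iterable[str], index_ids: Iterable[str]) -> IndexHealth:
    """Compare durable memory ids against candidate-index ids."""
    store, index = set(store_ids), set(index_ids)
    return IndexHealth(missing=store - index, extra=index - store)


report = check_index(["m1", "m2", "m3"], ["m2", "m3", "m9"])
```

A non-empty `missing` or `extra` set is the situation in which a rebuild from the source of truth is the appropriate fix.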

pgvector setup:

export WAVEMIND_PGVECTOR_DSN="postgresql://user:password@localhost:5432/wavemind"
wavemind --index pgvector remember "Andrey is a trader" --namespace demo
wavemind --index pgvector query "trader" --namespace demo

Optional pgvector environment variables:

  • WAVEMIND_PGVECTOR_TABLE - table name, default wavemind_vectors.
  • WAVEMIND_PGVECTOR_COLLECTION - collection key, default default.
  • WAVEMIND_PGVECTOR_CREATE_HNSW=1 - create an HNSW index using vector_cosine_ops when the installed pgvector version supports it.

If WAVEMIND_PGVECTOR_DSN is missing, WaveMind raises a clear error instead of silently falling back to another index backend. The pgvector table is created with the current encoder dimension, so use a separate table when switching between different vector sizes.
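
The fail-fast behavior can be pictured as a small configuration loader. This is a sketch of the documented behavior, not WaveMind's source: the environment variable names and defaults come from the list above, while `PgvectorConfig`, `load_pgvector_config`, and the error message are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class PgvectorConfig:
    dsn: str
    table: str
    collection: str
    create_hnsw: bool


def load_pgvector_config(env: Mapping[str, str] = os.environ) -> PgvectorConfig:
    """Read pgvector settings, refusing to continue without a DSN."""
    dsn = env.get("WAVEMIND_PGVECTOR_DSN", "").strip()
    if not dsn:
        # No silent fallback to another index backend.
        raise RuntimeError("WAVEMIND_PGVECTOR_DSN is required for the pgvector index")
    return PgvectorConfig(
        dsn=dsn,
        table=env.get("WAVEMIND_PGVECTOR_TABLE", "wavemind_vectors"),
        collection=env.get("WAVEMIND_PGVECTOR_COLLECTION", "default"),
        create_hnsw=env.get("WAVEMIND_PGVECTOR_CREATE_HNSW") == "1",
    )


cfg = load_pgvector_config(
    {"WAVEMIND_PGVECTOR_DSN": "postgresql://user:password@localhost:5432/wavemind"}
)
```

Passing the environment as a mapping keeps the loader testable without mutating `os.environ`.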

Qdrant setup:

export WAVEMIND_QDRANT_URL="http://localhost:6333"
export WAVEMIND_QDRANT_COLLECTION="wavemind_vectors"
wavemind --index qdrant remember "Andrey is a trader" --namespace demo
wavemind --index qdrant query "trader" --namespace demo

For local experiments you can set WAVEMIND_QDRANT_URL=":memory:", but production latency and durability should be measured against a real Qdrant service. If WAVEMIND_QDRANT_URL is missing, WaveMind raises a clear error instead of silently falling back to another backend.

Storage Backends

SQLite is the default source of truth. For multi-tenant production deployments, WaveMind also exposes PostgreSQL as an explicit source-of-truth backend:

export WAVEMIND_STORE="postgres"
export WAVEMIND_POSTGRES_DSN="postgresql://user:password@localhost:5432/wavemind"
wavemind --store postgres remember "Andrey is a trader" --namespace user:andrey
wavemind --store postgres query "trader" --namespace user:andrey

Optional table environment variables:

  • WAVEMIND_POSTGRES_MEMORIES_TABLE, default wavemind_memories.
  • WAVEMIND_POSTGRES_AUDIT_TABLE, default wavemind_audit_events.

Postgres storage is separate from pgvector: Postgres storage keeps memories, metadata, TTL, audit events, and vectors as durable application state; pgvector is a candidate index backend for nearest-neighbor search. You can use SQLite storage with pgvector, Postgres storage with NumPy/FAISS/Qdrant, or eventually Postgres storage plus pgvector when you want both state and vector search inside PostgreSQL.

Backup And Restore

Exact one-file backup:

wavemind --db ./state/wavemind.sqlite3 backup --out ./backups/wavemind.sqlite3

Timestamped backups with retention:

wavemind --db ./state/wavemind.sqlite3 backup --out ./backups --prefix wavemind --keep-last 7

Restore into a new or replacement SQLite file:

wavemind restore --from ./backups/wavemind-20260630-120000.sqlite3 --to ./state/wavemind.sqlite3 --overwrite

The backup command uses SQLite's backup API, so it is safe to run while the process is alive. Restore is intentionally an explicit command and refuses to overwrite an existing database unless --overwrite is passed. For Postgres storage, use database-native backup tooling such as pg_dump, managed snapshots, or point-in-time recovery instead of WaveMind's SQLite file backup command.
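
For readers unfamiliar with SQLite's backup API, the sketch below shows the same mechanism through Python's standard `sqlite3` module. It is not WaveMind's implementation: the `notes` table and the `online_backup` helper are hypothetical and exist only to demonstrate that a consistent copy can be taken while another connection is still open.

```python
import sqlite3
import tempfile
from pathlib import Path


def online_backup(src_path: Path, dst_path: Path) -> None:
    """Copy a live SQLite database using the online backup API."""
    dst_path.parent.mkdir(parents=True, exist_ok=True)
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # page-by-page copy that yields a consistent snapshot
    finally:
        dst.close()
        src.close()


workdir = Path(tempfile.mkdtemp())
live_path = workdir / "state" / "demo.sqlite3"
live_path.parent.mkdir(parents=True)

live = sqlite3.connect(live_path)
live.execute("CREATE TABLE notes (body TEXT)")
live.execute("INSERT INTO notes VALUES ('Andrey prefers short answers')")
live.commit()

# The writer connection stays open while the backup runs.
backup_path = workdir / "backups" / "demo.sqlite3"
online_backup(live_path, backup_path)
live.close()

restored = sqlite3.connect(backup_path)
rows = restored.execute("SELECT body FROM notes").fetchall()
restored.close()
```

Copying the database file with a plain file copy while a writer is active can capture a half-written state; the backup API avoids that, which is why the command is safe on a running process.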

HTTP API

Run the local FastAPI server:

wavemind --db ./app_memory.sqlite3 serve --host 127.0.0.1 --port 8000

Store and query memory over HTTP:

curl -X POST http://127.0.0.1:8000/remember -H "Content-Type: application/json" -d "{\"text\":\"Andrey is a trader\",\"namespace\":\"demo\"}"
curl -X POST http://127.0.0.1:8000/query -H "Content-Type: application/json" -d "{\"query\":\"trader\",\"namespace\":\"demo\",\"top_k\":1}"

Operational endpoints:

curl "http://127.0.0.1:8000/stats?namespace=demo"
curl "http://127.0.0.1:8000/audit?namespace=demo"
curl http://127.0.0.1:8000/metrics
curl http://127.0.0.1:8000/observability
curl http://127.0.0.1:8000/index/health
curl -X POST http://127.0.0.1:8000/index/rebuild
curl -X POST http://127.0.0.1:8000/backup -H "Content-Type: application/json" -d '{"path":"./backups","keep_last":7}'

/audit returns mutation events such as remember, forget, backup, and purge_expired. Query audit is opt-in with WAVEMIND_AUDIT_QUERIES=1 because writing an audit row for every query changes latency. /metrics returns a Prometheus-compatible text payload without adding a required dependency. /index/health reports source-of-truth versus candidate-index consistency. /index/rebuild rebuilds the candidate index from stored active memories and logs an index_rebuild audit event.

OpenTelemetry traces are optional and off by default:

pip install "wavemind[otel]"
export WAVEMIND_OTEL_ENABLED=1
export WAVEMIND_OTEL_SERVICE_NAME=wavemind-api
export WAVEMIND_OTEL_EXPORTER=otlp
export WAVEMIND_OTEL_ENDPOINT="http://localhost:4318/v1/traces"
wavemind --db ./app_memory.sqlite3 serve --host 127.0.0.1 --port 8000

Use WAVEMIND_OTEL_EXPORTER=console for local trace inspection. FastAPI requests are instrumented, and core memory phases such as encode, index search, graph propagation, reranking, load, and backup create spans when OpenTelemetry is enabled.

Production API controls are opt-in:

export WAVEMIND_READ_KEYS="read-key"
export WAVEMIND_WRITE_KEYS="write-key"
export WAVEMIND_ADMIN_KEYS="admin-key"
export WAVEMIND_RATE_LIMIT_PER_MINUTE=120

Role behavior:

| Role | Env var | Allows |
| --- | --- | --- |
| read | WAVEMIND_READ_KEYS | /query, /stats, /metrics, /index/health |
| write | WAVEMIND_WRITE_KEYS | read actions plus /remember and /import |
| admin | WAVEMIND_ADMIN_KEYS or WAVEMIND_API_KEYS | all actions, including /audit, /backup, /index/rebuild, and /forget |

Keys are accepted through Authorization: Bearer <key> or X-API-Key: <key>. If no key env vars are set, authentication is disabled for local development.
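
The role table can be read as a simple permission lookup. The following sketch encodes it on the client side, for example to fail early before sending a request with the wrong key. The helper names are hypothetical, and treating endpoints not listed for read or write as admin-only is an assumption rather than documented behavior.

```python
from typing import Dict, FrozenSet

# Endpoint sets copied from the role table above.
READ_PATHS: FrozenSet[str] = frozenset({"/query", "/stats", "/metrics", "/index/health"})
WRITE_PATHS: FrozenSet[str] = READ_PATHS | {"/remember", "/import"}

ROLE_PATHS: Dict[str, FrozenSet[str]] = {"read": READ_PATHS, "write": WRITE_PATHS}


def is_allowed(role: str, path: str) -> bool:
    """Return True if the role may call the endpoint, per the table above."""
    if role == "admin":
        return True  # admin covers all actions
    return path in ROLE_PATHS.get(role, frozenset())


def auth_headers(key: str, use_bearer: bool = True) -> Dict[str, str]:
    """Build one of the two accepted header forms for an API key."""
    if use_bearer:
        return {"Authorization": f"Bearer {key}"}
    return {"X-API-Key": key}
```

For example, `is_allowed("write", "/backup")` is false, so a write key should not be used for backup automation.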

Install From Source

For contributors installing from a local clone:

git clone https://github.com/CaspianG/wavemind.git
cd wavemind
python -m pip install -e ".[sentence]"

One-file setup scripts are also included in the repository:

sh install.sh
install.bat

LangChain Memory

Install the optional integration:

pip install "wavemind[langchain]"

Use WaveMind as a drop-in LangChain memory object:

from wavemind.integrations.langchain import WaveMindMemory

memory = WaveMindMemory(db_path="agent_memory.sqlite3")
# Replace: memory = ConversationBufferMemory()

Offline runnable example from a cloned repository:

python examples/langchain_memory.py

Framework Integrations

WaveMind only needs two touch points in an agent, service, notebook, or app:

  1. Before work happens, query() for relevant memory and pass the short result into the next step: a prompt, search screen, tool call, support workflow, or decision function.
  2. After work happens, remember() durable facts, preferences, summaries, outcomes, corrections, or notes.

That makes it usable in more than LangChain:

| Use case | Integration style |
| --- | --- |
| LangChain agent | Use WaveMindMemory from wavemind.integrations.langchain. |
| LangGraph workflow | Use make_recall_node() and make_persist_node() from wavemind.integrations.langgraph. |
| LlamaIndex pipeline | Use WaveMindRetriever from wavemind.integrations.llamaindex. |
| CrewAI crew | Use WaveMindCrewAITools from wavemind.integrations.crewai. |
| AutoGen loop | Use WaveMindAutoGenMemory from wavemind.integrations.autogen. |
| Custom Python agent | Create one WaveMind instance and call query() before the LLM. |
| Node, Go, Ruby, PHP, or no-code app | Run wavemind serve and call the HTTP API. |
| Multi-user SaaS | Use namespace="user:<id>" or namespace="tenant:<id>:agent:<id>". |
| Knowledge base or notebook | Store notes by project namespace and retrieve a small evidence set. |
| Support or CRM workflow | Store issues, preferences, resolutions, and corrections with tags. |
| Research workflow | Store observations with source metadata and expire temporary hypotheses. |
| Temporary context | Store with ttl_seconds=... so stale memory expires automatically. |
| Preference/profile memory | Store with tags such as profile, preference, project, decision. |
| Corrections/privacy | Use forget() or namespace deletion workflows. |

More examples: docs/USE_CASES.md. Migrating from a Chroma memory store: docs/CHROMA_MIGRATION.md.

Framework examples in this repository:

| Framework / pattern | Example |
| --- | --- |
| LangChain memory | examples/langchain_memory.py |
| OpenAI/OpenRouter-style agent loop | examples/agent_with_memory.py |
| LangGraph hooks | wavemind.integrations.langgraph, examples/framework_integrations.py |
| LlamaIndex-style retriever | wavemind.integrations.llamaindex, examples/framework_integrations.py |
| CrewAI-style tools | wavemind.integrations.crewai, examples/framework_integrations.py |
| AutoGen-style hooks | wavemind.integrations.autogen, examples/framework_integrations.py |
| Namespace sharding | examples/sharded_memory.py |

OpenClaw Integration

OpenClaw memory is file-centered: it writes durable memory into MEMORY.md, daily notes under memory/, and uses tools such as memory_search / memory_get. OpenClaw's documented agent loop also exposes hooks such as before_prompt_build, agent_end, message_received, and message_sent.

The safest WaveMind integration is a sidecar, not a replacement:

  • Keep OpenClaw's Markdown memory as the human-readable source of durable truth.
  • Use WaveMind as the dynamic recall layer for hotness, TTL, namespaces, and correction-sensitive ranking.
  • Store the SQLite file outside committed workspace files, for example ~/.openclaw/wavemind/<agent-id>.sqlite3.
  • Query WaveMind from before_prompt_build and inject a compact memory block with prependContext.
  • Capture new durable summaries from agent_end or message hooks.

Sketch of the adapter logic:

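from pathlib import Path
from wavemind import WaveMind

db_path = Path.home() / ".openclaw" / "wavemind" / "main.sqlite3"
# Create the parent directory up front; SQLite cannot open a file in a missing directory.
db_path.parent.mkdir(parents=True, exist_ok=True)
memory = WaveMind(db_path=str(db_path))

def before_prompt_build(agent_id: str, user_text: str) -> str:
    """Recall hook: return a compact memory block to prepend to the prompt."""
    namespace = f"openclaw:{agent_id}"
    hits = memory.query(user_text, namespace=namespace, top_k=5, min_score=0.25)
    return "\n".join(f"- {hit.text}" for hit in hits)

def agent_end(agent_id: str, summary: str) -> None:
    """Capture hook: store the durable summary of the finished run."""
    namespace = f"openclaw:{agent_id}"
    memory.remember(summary, namespace=namespace, tags=["summary"], priority=1.5)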

For a production OpenClaw plugin, translate that sketch into the documented plugin hook surface: before_prompt_build for recall and agent_end / message_received / message_sent for capture.

Hermes and Custom Agent Loops

The public HERMES Agent is a LangChain / LangGraph mathematical-reasoning agent. Its README describes HermesReasoner as a LangChain BaseTool and mentions an optional in-memory embedding store for previously verified claims.

WaveMind fits there as a persistent memory layer around that loop:

  • Recall previously verified claims before HermesReasoner is invoked.
  • Store successfully verified claims with tags=["verified-claim"].
  • Scope by user_id, project, benchmark, or theorem namespace.
  • Replace short-lived in-memory vector recall when the agent needs restarts, TTL, explicit forgetting, or cross-session reuse.

Generic Hermes-style loop:

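from wavemind import WaveMind

memory = WaveMind(db_path="./state/hermes_claims.sqlite3")

def verify_with_memory(user_id: str, problem: str) -> str:
    namespace = f"hermes:{user_id}"
    # Recall only claims that were verified in earlier sessions.
    claims = memory.query(problem, namespace=namespace, tags=["verified-claim"], top_k=5)
    context = "\n".join(f"- {claim.text}" for claim in claims)

    # call_hermes_reasoner is a placeholder for your own wrapper around HermesReasoner;
    # the result fields used below (label, claim, text) are illustrative.
    result = call_hermes_reasoner(problem=problem, extra_context=context)

    # Store only claims the reasoner accepted, with a raised priority.
    if result.label == "CORRECT":
        memory.remember(result.claim, namespace=namespace, tags=["verified-claim"], priority=2.0)
    return result.text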

For any other agent framework, the rule is the same: recall before the model, capture after the turn, isolate users with namespaces, and use TTL for temporary facts.

Non-Agent Use Cases

WaveMind can store any small-to-medium memory stream where meaning, freshness, and repeated use matter. It is useful when "show me the nearest text" is not enough and the application needs "show me what is relevant now."

| Use case | Example |
| --- | --- |
| Support memory | Recall past user issues, plans, bugs, and resolutions. |
| Product research | Store interview snippets with tags=["customer", "pain"]. |
| Team knowledge | Remember project decisions and suppress expired decisions with TTL. |
| Personal assistant | Store preferences, routines, people, and recurring context. |
| Game/NPC memory | Give characters scoped memory that strengthens after repeated events. |
| Trading research | Store labeled OHLCV pattern notes before building a backtest layer. |
| Document notebook | Import text/PDF/JSON chunks and query by namespace/project. |
| Personal knowledge base | Keep decisions, recurring context, people, links, and notes searchable without sending them to a hosted vector DB. |

Why Dynamic Memory

WaveMind is not positioned as "a faster Chroma." Chroma, Qdrant, Pinecone, and Weaviate are vector databases: they store embeddings and return nearest neighbors. That is the right tool for many static RAG workloads.

WaveMind is a dynamic memory layer. It still uses vector search first, but then applies memory-specific signals that a plain vector store does not model by default:

| Memory behavior | Why it matters | WaveMind mechanism |
| --- | --- | --- |
| Hot memories | Information that keeps being useful should become easier to recall again. | Wave-field hotness and priority updates. |
| Aging memories | Old low-value facts should fade instead of competing forever. | TTL and decay-aware scoring. |
| Scoped memory | One user, app, workspace, or project should not leak into another. | Namespaces and tags. |
| Explicit forgetting | Real systems need deletion, privacy cleanup, and correction workflows. | forget() plus SQLite persistence. |
| Stable restart behavior | A memory system must survive process restarts. | SQLite source of truth, reloadable indexes. |
| Vector plus memory rank | Semantic similarity is necessary but not sufficient for long-running memory. | k-NN candidates first, wave field as re-ranker. |

The current Chroma benchmark below is intentionally conservative: it compares static retrieval on the same facts and the same hash embeddings. That benchmark is useful, but it does not exercise WaveMind's main thesis: memory that changes over time as software recalls, reinforces, ages, and forgets information.

The benchmark that should decide whether WaveMind is worth using is a dynamic memory benchmark:

| Scenario | What should happen |
| --- | --- |
| A fact, preference, or decision is used many times. | WaveMind should rank it higher than equally similar but unused facts. |
| A fact expires via TTL. | WaveMind should suppress it without requiring manual vector cleanup. |
| A user or system corrects an old fact. | WaveMind should prefer the newer or reinforced memory. |
| A query is ambiguous across namespaces. | WaveMind should return only the scoped user's memory. |
| A long history has many irrelevant facts. | WaveMind should preserve useful recall instead of treating all vectors equally. |

In short: static vector search answers "what is nearest?" Dynamic memory also asks "what is still relevant, reinforced, scoped, and allowed to be remembered?"

Benchmarks

WaveMind tracks benchmarks in two layers:

  • Implemented local checks - fast, reproducible scripts that run from this repository and protect the core memory behavior.
  • Public benchmark roadmap - external retrieval and memory benchmarks that should decide whether WaveMind is competitive outside hand-made demos.

Machine-readable benchmark matrix: benchmarks/benchmark_matrix_results.json. Full generated benchmark report: benchmarks/BENCHMARK_REPORT.md.

Visual summary generated from the checked-in JSON results:

[Figure: WaveMind benchmark summary]

Regenerate the matrix and chart locally:

python benchmarks/benchmark_registry.py --output benchmarks/benchmark_matrix_results.json
python benchmarks/render_benchmark_charts.py --output docs/assets/benchmark-summary.svg

The chart shows completed local measurements plus the public benchmark roadmap. Planned public benchmarks stay out of the results section until the dataset, engine, and result JSON are committed.

Status legend:

  • implemented - script and checked-in result exist.
  • runner ready - adapter exists, but the official public dataset result is not checked in yet.
  • planned - benchmark is part of the public proof path, but no WaveMind result is claimed.

How to read the benchmark classes:

| Class | Popular examples | What it answers for WaveMind |
| --- | --- | --- |
| Retrieval / embeddings | BEIR, MTEB Retrieval, MIRACL | Does WaveMind preserve normal vector-search quality on public qrels? |
| Vector index / database | ANN-Benchmarks, VectorDBBench | Is the candidate index fast enough at scale? |
| Agent memory | LoCoMo, LongMemEval, LongMemEval-V2, LMEB | Does WaveMind retrieve the right evolving memory across long histories? |
| RAG quality | RAGBench | Does dynamic memory improve final context and answer quality? |

Current read:

| Area | Result | Honest interpretation |
| --- | --- | --- |
| Public agent-memory evidence | On official LoCoMo locomo10.json, WaveMind reaches evidence_recall@5 0.386 with hash embeddings and 0.547 with sentence-transformers. Fair namespace-filtered Chroma reaches 0.257 / 0.407; Qdrant reaches 0.263 / 0.409. | WaveMind retrieves more labeled evidence. Chroma is still the fastest static vector-store baseline. Qdrant local payload filtering is much slower than service-mode Qdrant should be. |
| Public retrieval sanity check | On BEIR SciFact, WaveMind reaches nDCG@10 0.354, Recall@10 0.482; Qdrant matches that quality; Chroma reaches 0.350 / 0.467 with identical hash embeddings. | Same-embedding retrieval quality is close. Chroma is fastest at 1.79 ms; Qdrant local is 17.71 ms; WaveMind exact path is 117.02 ms. |
| Static agent recall | WaveMind precision@1 equals Chroma at 0.82; WaveMind precision@3 is 0.90 vs Chroma 0.88. | Competitive quality, but Chroma is faster on the static vector-store path. |
| Dynamic memory policy | WaveMind reaches 1.00 stale suppression; Chroma static is 0.00. | This is the strongest current differentiation: hotness, TTL, corrections, and namespaces. |
| Field memory dynamics | Graph-enabled WaveMind reaches 1.00 precision@1, 1.00 stale suppression, and 1.00 concept formation vs static WaveMind at 0.20 / 0.20 / 0.00. | This is still synthetic, but it is the first regression check for memory-to-memory excitation, conflict inhibition, and decay. |
| Long-term evidence | WaveMind reaches 1.00 evidence recall@5, 1.00 precision@1, and 1.00 stale suppression on the synthetic long-memory evidence benchmark. | This is the first proof-shaped benchmark for agent memory: it measures whether stale/corrected/expired/cross-user facts stay out of retrieved evidence. |
| Capacity | Static precision@1 is 0.94 at 5000 memories; dynamic policy keeps 1.00 on the current checks. | Quality is holding on these checks, but dynamic latency must be optimized. |
| LongMemEval full retrieval | On the official LongMemEval-S cleaned file, 470 non-abstention session-level questions, WaveMind reaches evidence_recall@5 0.782 and precision@1 0.696; Chroma static reaches 0.518 / 0.355; Qdrant static reaches 0.520 / 0.355. | This is now the strongest public memory result in the repo. It is retrieval-only, not final answer quality. |
| ANN/index curve | At 50000 generated 128-d vectors, NumPy exact keeps recall@10 1.000 at 6.49 ms; quantized int8 keeps 0.934 at 24.92 ms; Annoy is faster at 4.92 ms but drops to 0.730 recall; Qdrant local keeps 1.000 recall at 43.49 ms. | Current local scale boundary is clear: quantized search needs kernel work, Annoy needs tuning/FAISS, and Qdrant should be tested in service mode for a fair production comparison. |
| Next public proof | LongMemEval / LoCoMo answer generation with a local LLM. | Retrieval is now measured. The next serious number should test answer accuracy, abstention, and faithfulness. |

Real Benchmark Matrix

| Benchmark | What it proves | Status | Baseline / competitor | Target |
| --- | --- | --- | --- | --- |
| Agent user-memory retrieval | Natural-language recall over 200 user facts. | implemented | Chroma | Match Chroma precision@1, beat precision@3, stay under 5 ms at 200 memories. |
| Dynamic memory policy | Hot memory, TTL, corrections, stale suppression, namespace isolation. | implemented | Chroma static | Keep precision@1 and stale suppression at 1.00, cut avg latency below 10 ms at 1000 memories. |
| Field memory graph dynamics | Related memories excite each other, newer conflicting memories suppress stale facts, graph energy decays, and active clusters expose concept candidates. | implemented | WaveMind static | Keep precision@1, stale suppression, and concept formation at 1.00 while moving from synthetic checks to LoCoMo/LongMemEval evidence. |
| WaveMind capacity curve | How recall and latency change at 200 / 1000 / 5000 memories. | implemented | WaveMind-only today | Keep precision@1 >= 0.95 at 5000 memories and dynamic latency below 20 ms. |
| Long-term memory evidence | Evidence retrieval from long histories with profile, preference, correction, TTL, namespace, and filler noise. | implemented | Static vector / Chroma / Qdrant | Keep this as a small regression test while public LoCoMo and LongMemEval runners carry the stronger evidence claims. |
| BEIR-style open retrieval runner | Public corpus.jsonl, queries.jsonl, qrels/*.tsv datasets with the same metrics for each engine. | implemented | WaveMind / Chroma / Qdrant | Use identical embeddings and report nDCG@k, Recall@k, MRR@k, precision@1, and latency. Current checked-in run: BEIR SciFact. |
| ANN/VectorDBBench-style local curve | Recall/latency tradeoff for candidate indexes on generated vectors. | implemented | NumPy exact / quantized int8 / Annoy / Qdrant local | Use this as the local engineering curve; official VectorDBBench remains future work. |
| BEIR | Standard zero-shot information retrieval quality. | planned | Chroma / Qdrant / FAISS | Stay within 0.02 nDCG@10 on identical embeddings. |
| MTEB Retrieval | Separates encoder quality from retrieval-store quality. | planned | Chroma / Qdrant / FAISS | Prove WaveMind does not reduce same-embedding retrieval quality. |
| MIRACL Russian | Multilingual retrieval with Russian relevance judgments. | planned | Chroma / Qdrant / FAISS | Reach same-embedding parity on Russian nDCG@10. |
| VectorDBBench | Vector database insertion/search/filter/cost-performance benchmark. | planned | Chroma / Qdrant / Milvus / Weaviate / Pinecone / FAISS | Use only after WaveMind has a production index path; today it is a memory layer, not a standalone cloud vector DB. |
LoCoMo Long conversation memory, temporal consistency, multi-hop recall. Retrieval-only runner is implemented for official locomo10.json. implemented Static vector / Chroma / Qdrant Improve answer generation accuracy on top of the stronger sentence-transformers evidence retrieval run.
LongMemEval Long-term assistant memory with updates and abstention. implemented retrieval, answer runner ready Static vector / Chroma / Qdrant / Mem0-style memory Add LLM answer quality and abstention after retrieval.
LongMemEval-V2 Web-agent memory: state recall, dynamic state, workflow gotchas. planned AgentRunbook-R / Chroma RAG / Qdrant RAG Prove WaveMind can retrieve compact evidence from agent trajectories.
LMEB Long-horizon memory embedding tasks beyond normal passage retrieval. planned Embedding-only baselines / Chroma / Qdrant Choose the default semantic encoder using memory-specific tasks.
RAGBench Downstream RAG context and answer quality. planned Chroma RAG / Qdrant RAG / Pinecone RAG Show whether stale-memory suppression improves context relevance.

The planned rows are not claimed wins. They are the public evaluation path WaveMind needs before strong production claims.

Open Retrieval Benchmarks

Many retrieval benchmarks use the same simple shape:

  • corpus.jsonl - documents with _id, optional title, and text.
  • queries.jsonl - queries with _id and text.
  • qrels/test.tsv - judged relevance rows: query-id, corpus-id, score.
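As an illustration of the three-file shape above, here is a minimal loader sketch. The function names are hypothetical and are not taken from the WaveMind runner; it assumes the qrels file is tab-separated with an optional header row.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def load_qrels(path):
    """Return {query_id: {doc_id: score}} from a tab-separated qrels file."""
    qrels = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            row = line.rstrip("\n").split("\t")
            # Skip blank/short rows and the optional header row.
            if len(row) < 3 or row[0] == "query-id":
                continue
            query_id, doc_id, score = row[0], row[1], int(row[2])
            qrels.setdefault(query_id, {})[doc_id] = score
    return qrels


def load_dataset(root):
    """Load corpus, queries, and test qrels from a BEIR-style directory."""
    root = Path(root)
    corpus = {doc["_id"]: doc for doc in load_jsonl(root / "corpus.jsonl")}
    queries = {q["_id"]: q["text"] for q in load_jsonl(root / "queries.jsonl")}
    qrels = load_qrels(root / "qrels" / "test.tsv")
    return corpus, queries, qrels
```

Keeping the judged scores as a nested dictionary makes it straightforward to feed the same labels to every engine under test.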

WaveMind includes a BEIR-style runner so the same downloaded dataset can be used for WaveMind, Chroma, and Qdrant:

pip install -e ".[bench]"
python benchmarks/open_retrieval_benchmark.py --dataset ./benchmarks/data/scifact --engines wavemind chroma qdrant --top-k 10

That runner reports nDCG@k, Recall@k, MRR@k, precision@1, average latency, and p95 latency. It intentionally uses the same WaveMind encoder for all engines, so the comparison is about retrieval/index behavior rather than which embedding model each project chooses by default.
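For readers unfamiliar with the ranking metrics named above, the following sketch shows the standard textbook definitions. It is not the runner's code; in particular, the nDCG here uses linear gain, which is an assumption (some implementations use an exponential gain instead).

```python
import math


def precision_at_1(ranked_ids, relevant):
    """1.0 if the top result is relevant, else 0.0."""
    return 1.0 if ranked_ids and ranked_ids[0] in relevant else 0.0


def recall_at_k(ranked_ids, relevant, k):
    """Share of relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


def mrr_at_k(ranked_ids, relevant, k):
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_ids, gains, k):
    """nDCG with linear gain; gains maps doc_id -> graded relevance."""
    dcg = sum(
        gains.get(doc_id, 0) / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

Per-query values are typically averaged over all test queries to produce the single number reported per engine.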

Checked-in BEIR SciFact result:

5183 documents, 300 test queries, HashingTextEncoder, top-k 10. This is a public retrieval sanity check, not the main agent-memory proof. Full machine-readable result: benchmarks/open_retrieval_scifact_results.json.

engine nDCG@10 Recall@10 MRR@10 precision@1 avg latency p95 latency
WaveMind 0.354 0.482 0.317 0.240 117.02 ms 256.57 ms
Chroma 0.350 0.467 0.315 0.243 1.79 ms 2.39 ms
Qdrant 0.354 0.482 0.317 0.240 17.71 ms 23.28 ms

Read this result narrowly: WaveMind preserves same-embedding retrieval quality on a real public dataset, but its current exact path is far slower than Chroma. Qdrant local preserves the same ranking quality and is much faster than the WaveMind NumPy exact path. The engineering target is a FAISS/Annoy candidate index with WaveMind's dynamic field policy applied only as a top-k re-ranker.
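The two-stage shape named as the engineering target can be sketched as follows. This is a toy under stated assumptions: the brute-force cosine scan stands in for a FAISS/Annoy candidate index, and `policy_boost` is a hypothetical per-memory multiplier, not WaveMind's actual field policy.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def candidate_search(query_vec, vectors, n_candidates):
    """Stage 1: stand-in for an ANN index (here a brute-force cosine scan)."""
    scored = [(cosine(query_vec, vec), mem_id) for mem_id, vec in vectors.items()]
    scored.sort(reverse=True)
    return scored[:n_candidates]


def rerank(candidates, policy_boost, top_k):
    """Stage 2: apply the (hypothetical) policy boost to the candidates only."""
    rescored = [(sim * policy_boost.get(mem_id, 1.0), mem_id) for sim, mem_id in candidates]
    rescored.sort(reverse=True)
    return [mem_id for _, mem_id in rescored[:top_k]]


vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
candidates = candidate_search([1.0, 0.0], vectors, n_candidates=2)
print(rerank(candidates, {"b": 1.5}, top_k=1))
```

The point of the split is that the policy layer only touches `n_candidates` items per query, so its cost no longer grows with the total number of stored memories.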

LoCoMo Evidence Retrieval

WaveMind now includes a retrieval-only runner for the public LoCoMo dataset. It treats LoCoMo conversation turns as memories and LoCoMo QA evidence dialog IDs as relevance labels. This measures the memory layer before any LLM answer-generation noise.

Run it on the official locomo10.json file:

mkdir -p benchmarks/data
curl -L https://raw.githubusercontent.com/snap-research/locomo/main/data/locomo10.json -o benchmarks/data/locomo10.json
python benchmarks/locomo_memory_benchmark.py --dataset benchmarks/data/locomo10.json --engines wavemind static chroma qdrant --top-k 5 --output benchmarks/locomo_evidence_results.json

Metrics reported:

  • evidence_recall@k - whether the labeled LoCoMo evidence turns appear in the returned memory block.
  • precision@1 - whether the first returned memory is labeled evidence.
  • MRR@k - how high the first relevant evidence turn appears.
  • context_budget_saved - how much smaller the returned evidence block is than the full conversation memory.
  • avg_latency_ms and p95_latency_ms - retrieval latency only.
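One plausible reading of these metrics is sketched below. The exact definitions live in the benchmark script; here evidence recall is assumed to be the fraction of labeled evidence IDs found in the top k, context budget is assumed to be measured in characters, and p95 uses the nearest-rank method. The dialog IDs in the usage lines are made-up examples.

```python
def evidence_recall_at_k(returned_ids, evidence_ids, k):
    """Fraction of labeled evidence ids found in the top-k returned ids."""
    evidence = set(evidence_ids)
    if not evidence:
        return 0.0
    return len(evidence & set(returned_ids[:k])) / len(evidence)


def context_budget_saved(returned_texts, all_texts):
    """1 - (characters returned / characters in the full conversation memory)."""
    full = sum(len(text) for text in all_texts)
    if full == 0:
        return 0.0
    return 1.0 - sum(len(text) for text in returned_texts) / full


def p95(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_ms)
    rank = max(1, (95 * len(ordered) + 99) // 100)  # integer ceil(0.95 * n)
    return ordered[rank - 1]


print(evidence_recall_at_k(["D1:3", "D2:7", "D1:9"], ["D1:3", "D4:1"], k=5))
print(context_budget_saved(["ab"], ["ab", "cdefgh"]))
```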

If Chroma or Qdrant are not installed, use the baseline-only command:

python benchmarks/locomo_memory_benchmark.py --dataset benchmarks/data/locomo10.json --engines wavemind static --top-k 5

Namespace Sharding

For multi-tenant local deployments, ShardedWaveMind routes namespaces across multiple SQLite files:

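from wavemind import ShardedWaveMind

# Route namespaces across 16 SQLite shard files under root_path.
memory = ShardedWaveMind(root_path="./state/wavemind-shards", shard_count=16)

# Each memory is stored under its own namespace.
memory.remember("Tenant A prefers short support replies.", namespace="tenant:a")
memory.remember("Tenant B tracks trading research.", namespace="tenant:b")

# The query is scoped to tenant:a, so tenant:b memories stay out of the result.
print(memory.query("support replies", namespace="tenant:a", top_k=3))
print(memory.stats())
memory.close()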

This is namespace-level sharding for isolation and local scale. It is not a distributed HA cluster yet; the roadmap keeps replication, operator support, and managed service work separate.
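To make the routing idea concrete, here is a minimal sketch of how a namespace could be mapped to a stable shard. This is an assumption for illustration only: the hash choice and the file naming are hypothetical and may differ from what ShardedWaveMind does internally.

```python
import hashlib


def shard_for(namespace, shard_count):
    """Map a namespace to a stable shard index in [0, shard_count)."""
    digest = hashlib.sha256(namespace.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shard_filename(namespace, shard_count):
    # Hypothetical file naming; not taken from WaveMind.
    return f"shard-{shard_for(namespace, shard_count):02d}.sqlite3"


print(shard_filename("tenant:a", 16))
print(shard_filename("tenant:b", 16))
```

A cryptographic digest is used instead of Python's built-in `hash()` because string hashing is randomized per process by default, which would send the same namespace to different files after a restart.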

Checked-in official LoCoMo retrieval result (from the LoCoMo Evidence Retrieval runner described above):

10 conversations, 5882 memory turns, 1977 evidence-labeled questions, HashingTextEncoder, top-k 5. Full machine-readable result: benchmarks/locomo_evidence_results.json.

engine evidence recall@5 precision@1 MRR@5 avg latency p95 latency
WaveMind 0.386 0.239 0.307 3.95 ms 7.44 ms
Static vector 0.263 0.133 0.189 1.94 ms 3.87 ms
Chroma static 0.257 0.129 0.185 7.03 ms 9.74 ms
Qdrant static 0.263 0.133 0.189 147.58 ms 210.23 ms

Checked-in semantic LoCoMo run:

Same official data, same engines, but with sentence-transformers/paraphrase-multilingual-mpnet-base-v2. Full machine-readable result: benchmarks/locomo_sentence_evidence_results.json.

engine evidence recall@5 precision@1 MRR@5 avg latency p95 latency
WaveMind 0.547 0.333 0.432 3.44 ms 5.56 ms
Static vector 0.409 0.219 0.305 1.25 ms 2.05 ms
Chroma static 0.407 0.218 0.304 4.97 ms 6.30 ms
Qdrant static 0.409 0.219 0.305 124.34 ms 149.72 ms

Read this as retrieval-only evidence quality, not full QA quality. Every engine within a table uses the same embeddings. The sentence-transformers run is the stronger evidence-quality number: WaveMind improves evidence recall@5 over static vector-store retrieval (0.547 vs 0.409), while the plain static vector baseline remains the fastest path (1.25 ms average) and WaveMind's average latency (3.44 ms) is lower than Chroma's (4.97 ms) in this run. The next LoCoMo step is answer generation and faithfulness with a local LLM on top of retrieved evidence.

LongMemEval Evidence Retrieval

WaveMind also includes a retrieval-only runner for the official LongMemEval format. It indexes each question's long chat history and measures whether the expected evidence sessions or turns are retrieved before answer generation.

Run the full session-level retrieval benchmark:

python benchmarks/longmemeval_memory_benchmark.py --dataset benchmarks/data/longmemeval_s_cleaned.json --engines wavemind static chroma qdrant --granularity session --top-k 5 --output benchmarks/longmemeval_evidence_results.json

Checked-in official LongMemEval-S retrieval result:

470 non-abstention questions from longmemeval_s_cleaned.json, 22419 session memories, HashingTextEncoder, top-k 5. Full machine-readable result: benchmarks/longmemeval_evidence_results.json.

engine evidence recall@5 precision@1 MRR@5 context saved avg latency p95 latency
WaveMind 0.782 0.696 0.762 0.869 7.27 ms 9.14 ms
Static vector 0.520 0.355 0.464 0.890 0.08 ms 0.10 ms
Chroma static 0.518 0.355 0.464 0.890 15.96 ms 18.68 ms
Qdrant static 0.520 0.355 0.464 0.890 398.48 ms 432.88 ms

The Chroma and Qdrant baselines now use the same namespace/payload scope as WaveMind. Qdrant is run in local embedded mode; the Qdrant client warns that local mode is not recommended above 20000 points, so this latency should not be read as a service-mode Qdrant result. The next step is answer-quality evaluation with a local LLM.

Answer-generation runner:

python benchmarks/longmemeval_answer_benchmark.py --dataset benchmarks/data/longmemeval_s_cleaned.json --provider ollama --model YOUR_LOCAL_MODEL --top-k 5 --output benchmarks/longmemeval_answer_results.json

There is also an extractive smoke run that does not require a model: benchmarks/longmemeval_answer_extractive_20_results.json. It is only a runner check, not a meaningful final answer-quality benchmark.

ANN Index Curve

WaveMind includes a local ANN/VectorDBBench-style curve for candidate indexes. It generates normalized vectors, queries with noisy copies, and measures recall@10 against exact cosine neighbors.
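The measurement described above can be sketched in a few lines. This is a pure-standard-library toy, not the benchmark script: the sizes are tiny, and the "approximate" index simply scans every other vector as a hypothetical stand-in for a real ANN backend.

```python
import math
import random


def normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def make_vectors(count, dim, rng):
    """Generate unit-length random vectors."""
    return [normalize([rng.gauss(0.0, 1.0) for _ in range(dim)]) for _ in range(count)]


def noisy_copy(vec, noise, rng):
    """Build a query by perturbing a stored vector and re-normalizing."""
    return normalize([x + rng.gauss(0.0, noise) for x in vec])


def top_k(query, vectors, ids, k):
    """Rank the given ids by dot product (equal to cosine for unit vectors)."""
    scored = sorted(
        ((sum(q * v for q, v in zip(query, vectors[i])), i) for i in ids),
        reverse=True,
    )
    return [i for _, i in scored[:k]]


def recall_at_k(approx_ids, exact_ids):
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


rng = random.Random(0)
vectors = make_vectors(500, 16, rng)
query = noisy_copy(vectors[42], 0.05, rng)
exact = top_k(query, vectors, range(len(vectors)), 10)
approx = top_k(query, vectors, range(0, len(vectors), 2), 10)  # toy "index"
print(recall_at_k(approx, exact))
```

Recall is always measured against the exact neighbor list, which is why the exact backend scores 1.000 by construction and the other backends are read relative to it.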

python benchmarks/ann_index_curve_benchmark.py --sizes 1000 5000 10000 50000 --dim 128 --queries 100 --top-k 10 --engines numpy quantized annoy faiss qdrant-local --output benchmarks/ann_index_curve_results.json

Add pgvector to --engines when WAVEMIND_PGVECTOR_DSN points at a PostgreSQL database with pgvector enabled. Add qdrant-service when WAVEMIND_QDRANT_URL points at a running Qdrant service. Add faiss-persisted when WAVEMIND_FAISS_PATH points at the FAISS snapshot file to validate persisted-index startup behavior.

Production profile example:

export WAVEMIND_FAISS_PATH="./state/ann-curve.faiss"
export WAVEMIND_QDRANT_URL="http://localhost:6333"
export WAVEMIND_PGVECTOR_DSN="postgresql://user:password@localhost:5432/wavemind"
python benchmarks/ann_index_curve_benchmark.py --sizes 10000 50000 --dim 128 --queries 100 --top-k 10 --engines faiss-persisted qdrant-service pgvector --output benchmarks/production_index_profile_results.json

Checked-in 50000-vector point:

engine recall@10 avg latency p95 latency build
WaveMind numpy 1.000 6.49 ms 6.41 ms 744.7 ms
WaveMind quantized 0.934 24.92 ms 37.36 ms 2088.7 ms
WaveMind annoy 0.730 4.92 ms 7.37 ms 4090.1 ms
WaveMind faiss skipped - - -
Qdrant local 1.000 43.49 ms 59.68 ms 17525.7 ms

Read this as an engineering curve, not an official VectorDBBench result. Annoy is faster than exact NumPy at 50000 vectors but loses too much recall with the current settings. The new quantized backend compresses vectors and keeps 0.934 recall@10 on this run, but the current Python/NumPy kernel is slower than exact NumPy; it is a memory-footprint baseline, not a latency win yet. FAISS persistence, service-mode Qdrant, and pgvector are now explicit benchmark profiles. If a required package, service, or environment variable is missing, the runner marks that engine as skipped instead of silently falling back to another backend.
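The skip-instead-of-fallback behavior can be illustrated with a small preflight check. The requirement table below is a hypothetical sketch: the environment variable names come from this section, but the module list per engine is an assumption and the real runner's checks may differ.

```python
import importlib.util
import os

# Hypothetical requirement table; the real runner's checks may differ.
REQUIREMENTS = {
    "numpy": {"modules": ["numpy"], "env": []},
    "faiss-persisted": {"modules": ["faiss"], "env": ["WAVEMIND_FAISS_PATH"]},
    "qdrant-service": {"modules": ["qdrant_client"], "env": ["WAVEMIND_QDRANT_URL"]},
    "pgvector": {"modules": [], "env": ["WAVEMIND_PGVECTOR_DSN"]},
}


def skip_reason(engine, environ=os.environ):
    """Return why an engine must be skipped, or None if it can run."""
    req = REQUIREMENTS.get(engine)
    if req is None:
        return f"unknown engine: {engine}"
    for name in req["env"]:
        if not environ.get(name):
            return f"missing environment variable {name}"
    for module in req["modules"]:
        if importlib.util.find_spec(module) is None:
            return f"missing package {module}"
    return None


def plan(engines, environ=os.environ):
    """Map each requested engine to 'run' or an explicit skip reason."""
    return {engine: skip_reason(engine, environ) or "run" for engine in engines}


print(plan(["faiss-persisted", "qdrant-service", "pgvector"]))
```

Reporting an explicit reason per engine keeps a results file honest: a missing backend shows up as skipped rather than as numbers produced by a different backend.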

Current Local Runs

Field memory dynamics benchmark:

13 memories, 5 conflicting-fact queries, deterministic local encoder. This benchmark isolates the MemoryFieldGraph: related memories can spread activation, newer conflicting memories inhibit stale facts, graph energy decays, and active clusters can surface concept candidates. Full machine-readable result: benchmarks/field_memory_dynamics_results.json.

engine precision@1 precision@3 stale suppression concept formation decay ratio avg latency
WaveMind graph 1.00 1.00 1.00 1.00 0.81 0.82 ms
WaveMind static 0.20 1.00 0.20 0.00 0.00 0.43 ms

Run locally from a cloned repository:

python benchmarks/field_memory_dynamics_benchmark.py
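The excitation, inhibition, and decay behavior that this benchmark checks can be pictured with a toy update rule. This is a hypothetical sketch, not the MemoryFieldGraph implementation: the node names, the `decay` and `spread` constants, and the update order are all assumptions.

```python
def step(energy, excite, inhibit, decay=0.8, spread=0.5):
    """One toy update: decay everything, spread along excite edges,
    then let newer memories suppress the stale ones they conflict with."""
    nxt = {node: value * decay for node, value in energy.items()}
    for src, dst in excite:
        nxt[dst] += spread * energy[src]
    for newer, stale in inhibit:
        nxt[stale] = max(0.0, nxt[stale] - energy[newer])
    return nxt


# A query activates both the stale and the newer address fact.
energy = {"old_address": 1.0, "new_address": 1.0, "new_city_notes": 0.0}
excite = [("new_address", "new_city_notes")]   # related memories excite each other
inhibit = [("new_address", "old_address")]     # newer fact suppresses the stale one

after_one = step(energy, excite, inhibit)
print(after_one)
```

In this toy, the stale fact drops to zero energy after one step while the related note gains energy it did not have from the query alone.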

Long-term memory evidence benchmark:

200 memories, 8 evidence queries, same HashingTextEncoder embeddings. This benchmark asks a stricter agent-memory question than static retrieval: did the system return the right evidence while suppressing stale, corrected, expired, or cross-user evidence? Full machine-readable result: benchmarks/long_memory_evidence_results.json.

engine evidence recall@5 precision@1 stale suppression context saved avg latency
WaveMind 1.00 1.00 1.00 0.87 6.10 ms
Static vector 1.00 0.57 0.00 0.88 0.65 ms

Run locally from a cloned repository:

python benchmarks/long_memory_evidence_benchmark.py --dataset synthetic --engines wavemind static --memories 200 --top-k 5

To compare the same normalized benchmark with Chroma or Qdrant, install the benchmark extras and include those engines:

pip install -e ".[bench]"
python benchmarks/long_memory_evidence_benchmark.py --dataset synthetic --engines wavemind chroma qdrant --memories 200 --top-k 5

Russian sentence retrieval benchmark:

Real Russian sentences from Tatoeba, 50 one-word queries, NumPy exact index.

metric hash sentence-transformers
precision@1 1.00 1.00
precision@3 1.00 1.00
avg query 0.49 ms 52.84 ms

Capacity check with the hash encoder:

memories precision@1 precision@3 avg query
200 1.00 1.00 0.49 ms
1000 0.88 1.00 1.50 ms
5000 0.72 0.88 5.68 ms

Run locally from a cloned repository:

python benchmarks/ru_sentences_benchmark.py --sentences 200 --queries 50 --encoder hash --index numpy
python benchmarks/ru_sentences_benchmark.py --sentences 200 --queries 50 --encoder sentence --index numpy

Agent-memory benchmark against Chroma:

200 Russian user facts, 50 natural-language questions, same precomputed HashingTextEncoder embeddings for WaveMind and Chroma. Full machine-readable result: benchmarks/agent_memory_results.json.

This is a static retrieval benchmark. It measures baseline ranking and latency, not hotness, TTL, repeated recall, or memory aging.

engine precision@1 precision@3 avg latency
WaveMind 0.82 0.90 2.25 ms
Chroma 0.82 0.88 0.93 ms

WaveMind-only capacity checks from the current ranking path:

scenario memories precision@1 precision@3 avg latency p95 latency
static agent facts 200 0.96 0.98 4.05 ms 8.18 ms
static agent facts 1000 0.96 0.98 3.53 ms 5.20 ms
static agent facts 5000 0.94 0.98 13.71 ms 17.20 ms
dynamic memory policy 200 1.00 1.00 38.40 ms 41.14 ms
dynamic memory policy 1000 1.00 1.00 54.29 ms 72.38 ms
dynamic memory policy 5000 1.00 1.00 48.36 ms 86.13 ms

Machine-readable local capacity result: benchmarks/wavemind_capacity_results.json. These capacity checks are WaveMind-only because the restricted local environment used for this run did not have Chroma installed.

Run locally from a cloned repository:

pip install -e ".[bench]"
python benchmarks/agent_memory_benchmark.py --engines wavemind chroma --facts 200 --queries 50

Dynamic agent-memory benchmark:

200 memories, 8 checks, same precomputed HashingTextEncoder embeddings. This benchmark exercises hot memory, TTL, corrections, and namespace isolation. WaveMind applies its built-in memory policy. Chroma static is a plain vector-store baseline without application-layer TTL, delete handling, namespace filters, or recall reinforcement. Full machine-readable result: benchmarks/dynamic_memory_results.json.

engine precision@1 precision@3 stale suppression avg latency
WaveMind 1.00 1.00 1.00 25.26 ms
Chroma static 0.57 1.00 0.00 1.75 ms

Category success:

behavior WaveMind Chroma static
hot memory 1.00 0.50
TTL 1.00 0.00
correction 1.00 0.00
namespace isolation 1.00 0.00
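The TTL, correction, and namespace-isolation categories above amount to an eligibility check that a plain vector store does not apply on its own. The sketch below shows that idea with a hypothetical `Memory` record; the field names are assumptions and do not mirror WaveMind's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Memory:
    id: str
    text: str
    namespace: str
    created_at: float
    ttl_seconds: Optional[float] = None
    superseded_by: Optional[str] = None  # set when a correction replaces this fact


def is_eligible(memory, namespace, now):
    if memory.namespace != namespace:
        return False  # namespace isolation
    if memory.ttl_seconds is not None and now >= memory.created_at + memory.ttl_seconds:
        return False  # TTL expired
    if memory.superseded_by is not None:
        return False  # a newer correction replaced this memory
    return True


def filter_candidates(candidates, namespace, now):
    """Drop candidates that a static nearest-neighbor search would still return."""
    return [m for m in candidates if is_eligible(m, namespace, now)]


candidates = [
    Memory("m1", "Office is in Berlin.", "user:1", 0.0, superseded_by="m2"),
    Memory("m2", "Office moved to Lisbon.", "user:1", 100.0),
    Memory("m3", "Promo code valid today.", "user:1", 0.0, ttl_seconds=60.0),
    Memory("m4", "Office is in Oslo.", "user:2", 100.0),
]
print([m.id for m in filter_candidates(candidates, "user:1", now=200.0)])
```

A static baseline can reach the same behavior, but only if the application adds equivalent metadata, deletes, and filters around the vector store.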

Run locally from a cloned repository:

pip install -e ".[bench]"
python benchmarks/dynamic_memory_benchmark.py --engines wavemind chroma --memories 200

Comparison

feature WaveMind Chroma Qdrant
Primary role Dynamic memory engine Embedding database Production vector database
Local SQLite persistence Yes Yes No, separate service/storage
HTTP API FastAPI included Included Included
Audit log / metrics SQLite audit events plus /metrics App-layer only App-layer / service metrics
Dynamic memory priority Wave-field hotness, TTL, priority Metadata/filter driven Payload/filter driven
Built-in forgetting TTL and explicit forget Manual delete/filtering Manual delete/filtering
Best fit Small to medium memory streams with dynamic recall Local RAG apps and prototypes Large-scale vector search
Scale target today Up to 1000 optimal on NumPy, FAISS recommended beyond 5000 Larger than WaveMind local mode Production scale

WaveMind is not trying to replace dedicated vector databases at scale. The intended product gap is dynamic priority: frequently used memories can become hotter while old or low-priority memories fade. For static RAG over large document collections, use a mature vector database. For memory that needs persistence, scoped recall, TTL, forgetting, and reinforcement, WaveMind is designed to sit above or beside the vector index.
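The "hotter while old memories fade" idea can be sketched as a score that multiplies vector similarity by a decaying recall signal. This formula is an illustrative assumption, not WaveMind's actual wave-field scoring; `half_life_seconds` and `weight` are hypothetical parameters.

```python
def hotness(recall_times, now, half_life_seconds):
    """Sum of recall events, each halved for every half-life that has passed."""
    return sum(
        0.5 ** ((now - t) / half_life_seconds) for t in recall_times if t <= now
    )


def priority_score(similarity, recall_times, now, half_life_seconds=86400.0, weight=0.2):
    """Boost similarity by recent recall activity; unused memories get no boost."""
    return similarity * (1.0 + weight * hotness(recall_times, now, half_life_seconds))


now = 30 * 86400.0
recent = priority_score(0.70, [now - 10, now - 20, now - 30], now)
stale = priority_score(0.75, [0.0], now)
print(recent > stale)
```

With these settings, a slightly less similar memory that was recalled three times in the last minute outranks a closer match that was last touched 30 days ago.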

If you already use Chroma for local memory, see the practical migration guide: docs/CHROMA_MIGRATION.md.

Known Limitations

  • Optimal capacity on the current NumPy exact index is up to 1000 records.
  • At 5000 records, one-word precision@1 is currently 0.72 with the hash encoder; many misses are ambiguous queries where another sentence containing the same word ranks first.
  • For N > 5000, the NumPy exact index is still reliable but scales linearly. Annoy is faster at 50000 vectors in the local curve, but current recall is only 0.730; the quantized backend reaches 0.934 recall@10 but is slower than NumPy on the current kernel. Use FAISS or a production vector service before claiming large-scale ANN quality.
  • sentence-transformers/paraphrase-multilingual-mpnet-base-v2 requires about 420 MB of model files. Benchmark runners cache embeddings so retrieval latency is measured separately from model encoding latency.
  • The Chroma comparison currently uses shared precomputed hash embeddings to isolate retrieval/ranking behavior; semantic model comparisons should be run separately.
  • The BEIR SciFact run uses the hash encoder to isolate index/retrieval behavior. It is not a semantic embedding leaderboard result.
  • On BEIR SciFact, WaveMind and Qdrant match on hash-encoder nDCG@10, while Chroma is much faster. The next index milestone is FAISS/Annoy candidate generation plus WaveMind top-k re-ranking.
  • The LoCoMo results are retrieval-only evidence results, not final answer-quality scores. The sentence-transformers run is stronger than the hash run, but still needs answer generation and faithfulness checks.
  • In the 200-fact agent benchmark, Chroma is faster on average while WaveMind is slightly higher at precision@3.
  • The dynamic benchmark currently compares WaveMind against a static Chroma baseline. Chroma and Qdrant can implement similar behavior with extra application-layer metadata policy, deletes, filters, and reinforcement logic.
  • MemoryFieldGraph is a discrete graph over stored memories, not a continuous mathematical field. Its current build path should be optimized with incremental edge updates before large production use.
  • pgvector is a candidate-index backend. PostgreSQL source-of-truth storage is also available separately, but migrations, PITR docs, and service benchmark profiles still need more real deployment coverage.
  • The Qdrant backend is also a candidate-index backend. WaveMind rebuilds it from SQLite on load/build, so large service-mode deployments still need a measured rebuild strategy and index-health monitoring.
  • The persisted FAISS backend validates a snapshot against current memory ids and avoids unnecessary FAISS rebuilds when the snapshot matches. It is still a single-node flat-index path, not distributed HA.
  • The quantized backend is an explicit int8 candidate-index experiment. It reduces vector precision and must be benchmarked per workload before use.
  • The synthetic long-term memory evidence benchmark is useful for regression and product-shape proof, but public claims should lean on LoCoMo and LongMemEval instead.
  • The LongMemEval result is retrieval-only. It is not a full LongMemEval answer-generation leaderboard-equivalent score.
  • Qdrant baselines in this README use embedded local mode. Qdrant itself warns that local mode is not recommended above 20000 points; use the qdrant-service benchmark profile before making production latency claims.
  • MTEB, MIRACL, LMEB, official VectorDBBench, and RAGBench are listed as the public benchmark roadmap, not as completed results yet.
  • Ollama answer generation is implemented, but the current machine has no local Ollama model available and the local Ollama API returns 502/connection-reset. The checked-in answer file is extractive smoke only, not an LLM score.
  • Public benchmark adapters require optional datasets, heavier dependencies, or running services. They are intentionally outside the minimal pip install wavemind path.
  • Dynamic memory is slower than static Chroma in the current local benchmark: 25.26 ms vs 1.75 ms average query latency on this machine.
  • Current WaveMind-only dynamic checks keep precision@1 at 1.00 through 5000 memories, but average latency is roughly 38-54 ms across the 200, 1000, and 5000 memory checks. The next optimization target is field/re-ranking latency, not basic recall quality.

Roadmap

Full roadmap: docs/ROADMAP.md. Launch and positioning kit: docs/LAUNCH_KIT.md.

Near-term priorities:

  • Service-mode Qdrant, pgvector, and persisted-FAISS benchmark runs on a real production-like machine.
  • Migration tooling and operational docs for Postgres source-of-truth storage.
  • Tune the new quantized int8 backend so compression does not cost more latency than exact NumPy on common workloads.
  • Service-mode Qdrant and FAISS latency baselines that run through WaveMind's explicit candidate-index backends, not only the standalone Qdrant benchmark baseline.
  • LoCoMo and LongMemEval answer-quality evaluation, not retrieval only.
  • Harden framework adapters: LangGraph, LlamaIndex, CrewAI, AutoGen, OpenClaw, and HTTP-only sidecar use.
  • Faster dynamic re-ranking through smaller candidate windows, caching, and background updates.
  • Better production operations: OpenTelemetry is optional and implemented; richer latency histograms, index-health metrics, alerting examples, and restore drills are next.

Longer-term direction:

  • scale from thousands of memories to 100k-1M on one node;
  • keep SQLite as the local source of truth while adding Postgres and external vector backends for production;
  • evolve MemoryFieldGraph from a regression-tested graph into a stronger field-memory model with excitation, inhibition, decay, and consolidation;
  • build enterprise features only after benchmarked retrieval, latency, and answer-quality evidence are solid.

Contributing

Contributing guide: CONTRIBUTING.md.

Useful contribution paths:

  • add reproducible benchmark adapters and checked-in result JSON;
  • improve FAISS, Qdrant, pgvector, or other candidate-index backends;
  • add examples for LangGraph, LlamaIndex, CrewAI, AutoGen, OpenClaw, and HTTP-only sidecar deployments;
  • improve dynamic memory behavior around TTL, corrections, namespaces, graph excitation/inhibition, and consolidation;
  • harden production operations: backups, audit logs, metrics, tracing, and migration tools.

GitHub issue templates are included for bugs, features, benchmarks, and integrations. Benchmark claims need a reproduction command and a committed result artifact before they are added to the README.

License

MIT. See LICENSE.

