memvid-sdk

Single-file AI memory system for Python. Store documents, search with BM25 ranking, and run retrieval-augmented generation (RAG) queries from a portable .mv2 file.

Built in Rust with PyO3 bindings. No database setup, no network dependencies, no configuration files.

Install

pip install memvid-sdk

Optional extras for framework integrations:

pip install "memvid-sdk[langchain]"     # LangChain tools
pip install "memvid-sdk[llamaindex]"    # LlamaIndex query engine
pip install "memvid-sdk[openai]"        # OpenAI function schemas
pip install "memvid-sdk[crewai]"        # CrewAI tools
pip install "memvid-sdk[full]"          # All integrations

Quick start

from memvid_sdk import use, create

# Open a memory file (mode="auto" creates it if it doesn't exist)
mv = use("basic", "notes.mv2", mode="auto")

# Store a document
mv.put(
    title="Project kickoff",
    label="meeting",
    metadata={"date": "2024-01-15"},
    text="Discussed timeline, assigned tasks to team members.",
)

# Search by keyword
results = mv.find("timeline")
print(results["hits"])

# Ask a question (retrieves relevant context)
answer = mv.ask("What was discussed in the kickoff?")
print(answer["context"])

# Commit changes
mv.seal()
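Under the hood, find() ranks lexical matches with BM25. As a rough illustration of how BM25 trades term frequency against document length (a pure-Python sketch, not the SDK's Rust implementation):

```python
import math

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n  # average document length
    scores = []
    for doc in docs:
        score, dl = 0.0, len(doc)
        for term in query_terms:
            df = sum(1 for d in docs if term in d)        # document frequency
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            tf = doc.count(term)                           # term frequency
            score += idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * dl / avgdl))
        scores.append(score)
    return scores

docs = [
    "discussed timeline assigned tasks".split(),
    "lunch menu for friday".split(),
]
print(bm25_scores(["timeline"], docs))  # first doc scores higher
```

Documents that never mention a query term score zero, which is why keyword search alone misses paraphrases; that gap is what the semantic modes below address.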

Embeddings & semantic search

Semantic search (mode="sem") and hybrid search (mode="auto") require a vector index:

from memvid_sdk import create

mv = create("notes.mv2", enable_vec=True)

Generate embeddings during ingestion (local or OpenAI)

mv.put(
    title="Doc",
    label="note",
    metadata={},
    text="alpha fastembed test",
    enable_embedding=True,
    embedding_model="bge-small",  # local (fastembed)
)

Batch mode:

mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    opts={"enable_embedding": True, "embedding_model": "bge-small"},
)

Supported embedding_model values: bge-small, bge-base, nomic, gte-large, openai-small, openai-large, openai-ada.

Bring your own embeddings (precomputed)

Store embedding identity metadata so semantic queries can auto-detect the right model later:

mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    embeddings=[[0.1, 0.2, 0.3, 0.4]],
    embedding_identity={"provider": "custom", "model": "my-embedder-v1"},
)

Use an embedder (SDK-managed query embeddings)

from memvid_sdk.embeddings import HashEmbeddings

embedder = HashEmbeddings(dimension=32)  # deterministic offline embedder for tests
mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    embedder=embedder,
)

mv.find("alpha", mode="sem", embedder=embedder)
mv.ask("alpha", mode="sem", context_only=True, embedder=embedder)

Built-in embedders: OpenAIEmbeddings (OPENAI_API_KEY), CohereEmbeddings (COHERE_API_KEY), VoyageEmbeddings (VOYAGE_API_KEY), NvidiaEmbeddings (NVIDIA_API_KEY), HuggingFaceEmbeddings (local), HashEmbeddings (offline deterministic).

from memvid_sdk.embeddings import NvidiaEmbeddings

embedder = NvidiaEmbeddings(model="nvidia/nv-embed-v1")  # uses NVIDIA_API_KEY
mv.put_many([{"title": "Doc", "label": "note", "text": "alpha"}], embedder=embedder)
mv.find("alpha", mode="sem", embedder=embedder)
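HashEmbeddings produces deterministic vectors with no model download, which is why it suits offline tests. Conceptually it amounts to hashing tokens into a fixed number of buckets and normalizing (an illustrative sketch, not the SDK's actual algorithm):

```python
import hashlib
import math

def hash_embed(text, dimension=32):
    """Map each token to a bucket via a stable hash, then L2-normalize."""
    vec = [0.0] * dimension
    for token in text.lower().split():
        bucket = int(hashlib.sha256(token.encode()).hexdigest(), 16) % dimension
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Same text always yields the same vector -- handy for reproducible tests.
print(hash_embed("alpha fastembed test")[:4])
```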

Auto query embeddings (no query_embedding required)

If the memory contains memvid.embedding.* identity metadata (recommended) or a known vector dimension (fallback), semantic queries can embed the query automatically:

mv.find("alpha", mode="sem")
mv.find("alpha", mode="sem", query_embedding_model="bge-small")  # override/force

Core API

Opening a memory

mv = use(kind, path, apikey=None, **options)

| Parameter | Type | Description |
| --- | --- | --- |
| kind | str | Adapter type: "basic", "langchain", "llamaindex", "openai", "crewai", "vercel-ai", "autogen" |
| path | str | Path to .mv2 file |
| mode | str | "open" (default), "create", or "auto" |
| enable_lex | bool | Enable lexical index (default: True) |
| enable_vec | bool | Enable vector index (default: False) |
| read_only | bool | Open in read-only mode (default: False) |
| lock_timeout_ms | int | Lock acquisition timeout in milliseconds (default: 250) |
| force | str \| None | Set to "stale_only" to force-release stale locks |
| force_writable | bool | Open read-only first, then re-open writable (best-effort) |
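If another process holds the lock past lock_timeout_ms, opening raises LockedError (code MV007, listed under Error handling below). A generic retry-with-backoff wrapper can smooth over transient contention (a sketch, not part of the SDK):

```python
import time

def open_with_retry(open_fn, attempts=3, base_delay=0.25, locked_exc=Exception):
    """Call open_fn, retrying with exponential backoff while it raises locked_exc."""
    for attempt in range(attempts):
        try:
            return open_fn()
        except locked_exc:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the lock error
            time.sleep(base_delay * (2 ** attempt))
```

With the SDK, something like `open_with_retry(lambda: use("basic", "notes.mv2"), locked_exc=LockedError)` would retry briefly before giving up.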

Context manager support:

with use("basic", "notes.mv2") as mv:
    mv.put(title="Note", label="general", metadata={}, text="Content")
# File handle automatically closed

Storing documents

frame_id = mv.put(
    title="Document title",
    label="category",
    metadata={"key": "value"},
    text="Document content...",
    uri="mv2://custom/path",
    tags=["tag1", "tag2"],
)

Batch ingestion

For bulk imports, put_many processes documents in parallel, with a 100x+ speedup over individual put calls:

docs = [
    {"title": "Doc 1", "label": "news", "text": "First document..."},
    {"title": "Doc 2", "label": "news", "text": "Second document..."},
    # ... thousands more
]

frame_ids = mv.put_many(docs, opts={"compression_level": 3})
print(f"Ingested {len(frame_ids)} documents")
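Bulk imports often start from tabular data. A small helper that shapes CSV rows into put_many documents (a sketch assuming hypothetical title, label, and text columns):

```python
import csv
import io

def docs_from_csv(fileobj):
    """Build put_many-shaped dicts from a CSV with title,label,text columns."""
    return [
        {"title": row["title"], "label": row["label"], "text": row["text"]}
        for row in csv.DictReader(fileobj)
    ]

# In practice fileobj would be open("docs.csv"); StringIO keeps this self-contained.
sample = io.StringIO("title,label,text\nDoc 1,news,First document\n")
print(docs_from_csv(sample))
```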

Searching

results = mv.find(query, k=5, snippet_chars=240, scope=None, mode=None)

| Parameter | Type | Description |
| --- | --- | --- |
| query | str | Search query |
| k | int | Number of results (default: 5) |
| snippet_chars | int | Snippet length (default: 240) |
| scope | str | Filter by URI prefix |
| mode | str | "auto", "lex", or "sem" |

Semantic/hybrid options (when mode != "lex"):

| Parameter | Type | Description |
| --- | --- | --- |
| query_embedding | list[float] \| None | Precomputed query embedding |
| query_embedding_model | str \| None | Force embedding model for auto query embeddings |
| adaptive | bool \| None | Enable adaptive retrieval cutoff |
| min_relevancy | float \| None | Minimum relevancy (default: 0.5 when adaptive) |
| max_k | int \| None | Maximum results (default: 100 when adaptive) |
| adaptive_strategy | str \| None | One of: relative, absolute, cliff, elbow, combined |

Retrieval-augmented generation

response = mv.ask(
    question,
    k=6,
    mode="auto",
    model="openai:gpt-4o-mini",
    api_key=os.environ.get("OPENAI_API_KEY"),
    context_only=False,
    mask_pii=False,
)

| Parameter | Type | Description |
| --- | --- | --- |
| question | str | Question to answer |
| k | int | Documents to retrieve (default: 6) |
| mode | str | "auto", "lex", or "sem" |
| model | str | LLM for synthesis (e.g., "openai:gpt-4o-mini", "nvidia:meta/llama3-8b-instruct") |
| api_key | str | API key for the LLM provider |
| context_only | bool | Skip synthesis, return context only |
| mask_pii | bool | Redact PII from response |

Semantic/hybrid options (when mode != "lex"):

| Parameter | Type | Description |
| --- | --- | --- |
| query_embedding | list[float] \| None | Precomputed query embedding |
| query_embedding_model | str \| None | Force embedding model for auto query embeddings |
| adaptive | bool \| None | Enable adaptive retrieval cutoff |
| min_relevancy | float \| None | Minimum relevancy (default: 0.5 when adaptive) |
| max_k | int \| None | Maximum results (default: 100 when adaptive) |
| adaptive_strategy | str \| None | One of: relative, absolute, cliff, elbow, combined |

Best practices: Adaptive retrieval

For best search quality, enable adaptive retrieval with the combined strategy. This dynamically adjusts result counts based on relevance scores rather than returning a fixed k:

# Recommended for find()
results = mv.find(
    "your query",
    mode="sem",  # or "auto"
    adaptive=True,
    adaptive_strategy="combined",
)

# Recommended for ask()
answer = mv.ask(
    "your question",
    mode="auto",
    adaptive=True,
    adaptive_strategy="combined",
)

The combined strategy uses both relative thresholds and score cliff detection to filter out low-relevance results, providing higher quality context for RAG applications.
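One way to picture such a combined cutoff: keep results whose score stays within a fraction of the top score, and stop at the first sharp drop between neighbors (an illustrative sketch with made-up thresholds, not the SDK's exact heuristics):

```python
def adaptive_cutoff(scores, rel_threshold=0.5, cliff_ratio=0.5):
    """Given scores sorted descending, return how many results to keep.

    Stops when a score falls below rel_threshold * top score (relative test)
    or below cliff_ratio * its predecessor (cliff test)."""
    if not scores:
        return 0
    keep = 1
    for prev, cur in zip(scores, scores[1:]):
        if cur < rel_threshold * scores[0]:
            break
        if cur < cliff_ratio * prev:
            break
        keep += 1
    return keep

print(adaptive_cutoff([0.92, 0.90, 0.41, 0.40]))  # cuts at the score cliff
```

The payoff for RAG is that a query with two strong matches passes two documents to the LLM instead of padding the context with six weakly relevant ones.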

Timeline queries

entries = mv.timeline(limit=100, since=1704067200, reverse=True)
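The since argument takes Unix epoch seconds (1704067200 above is 2024-01-01 UTC); the standard library converts calendar dates directly:

```python
from datetime import datetime, timezone

# Build an epoch-seconds value for timeline(since=...) from a calendar date.
since = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp())
print(since)  # 1704067200
```

Using an explicit tzinfo avoids the timestamp silently shifting with the local machine's timezone.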

Statistics

stats = mv.stats()
# {'frame_count': 42, 'size_bytes': 1048576, 'has_lex_index': True, ...}

File operations

# Commit pending changes
mv.seal()

# Explicit commit without sealing
mv.commit()

# Verify file integrity
report = mv.verify(deep=True)
report = Memvid.verify("notes.mv2", deep=True)  # also supported

# Repair indexes
result = mv.doctor(dry_run=True)
result = Memvid.doctor("notes.mv2", rebuild_time_index=True)

Diagnostics

from memvid_sdk import info, lock_who, lock_nudge, verify_single_file

print(info())                       # versions + enabled features
print(lock_who("notes.mv2"))        # lock status + owner (if locked)
print(lock_nudge("notes.mv2"))      # request stale lock release
verify_single_file("notes.mv2")     # ensure no sidecar files exist

Framework adapters

Adapters expose framework-native tools when the corresponding dependency is installed.

LangChain

mv = use("langchain", "notes.mv2")
tools = mv.tools  # List of StructuredTool instances

LlamaIndex

mv = use("llamaindex", "notes.mv2")
engine = mv.as_query_engine()
response = engine.query("What is the timeline?")

OpenAI function calling

mv = use("openai", "notes.mv2")
functions = mv.functions  # JSON schemas for tool_calls

CrewAI

mv = use("crewai", "notes.mv2")
tools = mv.tools  # CrewAI-compatible tools

Table extraction

Extract structured tables from PDFs:

result = mv.put_pdf_tables("report.pdf", embed_rows=True)
print(f"Extracted {result['tables_count']} tables")

tables = mv.list_tables()
data = mv.get_table(tables[0]["table_id"], format="dict")

Error handling

Typed exceptions with stable codes:

| Code | Exception | Description |
| --- | --- | --- |
| MV001 | CapacityExceededError | Storage capacity exceeded |
| MV002 | TicketInvalidError | Invalid ticket signature |
| MV003 | TicketReplayError | Ticket replay detected |
| MV004 | LexIndexDisabledError | Lexical index not enabled |
| MV005 | TimeIndexMissingError | Time index missing |
| MV006 | VerifyFailedError | Verification failed |
| MV007 | LockedError | File locked by another process |
| MV008 | ApiKeyRequiredError | API key required |
| MV009 | MemoryAlreadyBoundError | File bound to another memory |
| MV010 | FrameNotFoundError | Frame not found |
| MV011 | VecIndexDisabledError | Vector index not enabled |
| MV012 | CorruptFileError | Corrupt file / invalid TOC |
| MV013 | FileNotFoundError | File not found |
| MV014 | VecDimensionMismatchError | Vector dimension mismatch |
| MV015 | EmbeddingFailedError | Embedding failed |
| MV016 | ClipIndexDisabledError | CLIP index not enabled |
| MV017 | NerModelNotAvailableError | NER model not available |

from memvid_sdk import CapacityExceededError

try:
    mv.put(...)
except CapacityExceededError:
    print("Out of storage capacity")

Environment variables

| Variable | Description |
| --- | --- |
| MEMVID_API_KEY | API key for capacity beyond the free tier |
| MEMVID_API_URL | Control plane URL (for enterprise deployments) |
| MEMVID_OFFLINE | Set to 1 to disable network providers and model downloads |
| MEMVID_MODELS_DIR | Path to local embedding model cache (fastembed) |
| MEMVID_CLIP_MODEL | Local CLIP model name (default: mobileclip-s2) |
| OPENAI_API_KEY | API key for OpenAI embeddings (also usable for ask() examples) |
| OPENAI_BASE_URL | Optional OpenAI-compatible base URL for embeddings |
| NVIDIA_API_KEY | API key for NVIDIA Integrate embeddings |
| NVIDIA_BASE_URL | Optional NVIDIA Integrate base URL (default: https://integrate.api.nvidia.com) |
| NVIDIA_EMBEDDING_MODEL | Optional NVIDIA embedding model override |

Development

Build from source:

cd proprietary/memvid-bindings/python

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dev dependencies
pip install maturin

# Build native extension
maturin develop --release --skip-install

# Run tests
./.venv/bin/python -m pytest -q

# Smoke-run offline-safe examples
./.venv/bin/python scripts/test_examples.py

# Optional: run gated embedding runtime tests
MEMVID_TEST_FASTEMBED=1 ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py
MEMVID_TEST_OPENAI=1 OPENAI_API_KEY=... ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py
MEMVID_TEST_NVIDIA=1 NVIDIA_API_KEY=... ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py

# Optional: run gated local model tests (requires `memvid models install ...`)
MEMVID_TEST_CLIP=1 ./.venv/bin/python -m pytest -q tests/test_phase2_models_tables.py
MEMVID_TEST_NER=1 ./.venv/bin/python -m pytest -q tests/test_phase2_models_tables.py

Requirements

  • Python 3.8+
  • macOS (Apple Silicon or Intel), Linux (x64), or Windows (x64)

License

Apache-2.0. See LICENSE.
