memvid-sdk

Single-file AI memory system for Python. Store documents, search with BM25 ranking, and run retrieval-augmented generation (RAG) queries from a portable .mv2 file.

Built on Rust with PyO3 bindings. No database setup, no network dependencies, no configuration files.

Install

pip install memvid-sdk

Optional extras for framework integrations:

pip install "memvid-sdk[langchain]"     # LangChain tools
pip install "memvid-sdk[llamaindex]"    # LlamaIndex query engine
pip install "memvid-sdk[openai]"        # OpenAI function schemas
pip install "memvid-sdk[crewai]"        # CrewAI tools
pip install "memvid-sdk[full]"          # All integrations

Quick start

from memvid_sdk import use, create

# Open notes.mv2, creating it if it doesn't exist
mv = use("basic", "notes.mv2", mode="auto")

# Store a document
mv.put(
    title="Project kickoff",
    label="meeting",
    metadata={"date": "2024-01-15"},
    text="Discussed timeline, assigned tasks to team members.",
)

# Search by keyword
results = mv.find("timeline")
print(results["hits"])

# Ask a question (retrieves relevant context)
answer = mv.ask("What was discussed in the kickoff?")
print(answer["context"])

# Commit changes
mv.seal()

Embeddings & semantic search

Semantic search (mode="sem") and hybrid search (mode="auto") require a vector index:

from memvid_sdk import create

mv = create("notes.mv2", enable_vec=True)

Generate embeddings during ingestion (local or OpenAI):

mv.put(
    title="Doc",
    label="note",
    metadata={},
    text="alpha fastembed test",
    enable_embedding=True,
    embedding_model="bge-small",  # local (fastembed)
)

Batch mode:

mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    opts={"enable_embedding": True, "embedding_model": "bge-small"},
)

Supported embedding_model values: bge-small, bge-base, nomic, gte-large, openai-small, openai-large, openai-ada.

Bring your own embeddings (precomputed)

Store embedding identity metadata so semantic queries can auto-detect the right model later:

mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    embeddings=[[0.1, 0.2, 0.3, 0.4]],
    embedding_identity={"provider": "custom", "model": "my-embedder-v1"},
)

Use an embedder (SDK-managed query embeddings)

from memvid_sdk.embeddings import HashEmbeddings

embedder = HashEmbeddings(dimension=32)  # deterministic offline embedder for tests
mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    embedder=embedder,
)

mv.find("alpha", mode="sem", embedder=embedder)
mv.ask("alpha", mode="sem", context_only=True, embedder=embedder)

Built-in embedders: OpenAIEmbeddings (OPENAI_API_KEY), CohereEmbeddings (COHERE_API_KEY), VoyageEmbeddings (VOYAGE_API_KEY), NvidiaEmbeddings (NVIDIA_API_KEY), HuggingFaceEmbeddings (local), HashEmbeddings (offline deterministic).

from memvid_sdk.embeddings import NvidiaEmbeddings

embedder = NvidiaEmbeddings(model="nvidia/nv-embed-v1")  # uses NVIDIA_API_KEY
mv.put_many([{"title": "Doc", "label": "note", "text": "alpha"}], embedder=embedder)
mv.find("alpha", mode="sem", embedder=embedder)
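
`HashEmbeddings` produces deterministic vectors from text alone, with no model download or API key. The idea can be sketched in plain Python (an illustration of the concept only, not the SDK's implementation; `hash_embed` is a hypothetical helper):

```python
import hashlib

def hash_embed(text, dimension=32):
    """Deterministically map text to a fixed-dimension vector in [0, 1)."""
    vec = []
    for i in range(dimension):
        # Hash the text together with the component index so each
        # dimension gets an independent, reproducible value.
        digest = hashlib.sha256(f"{i}:{text}".encode()).digest()
        vec.append(int.from_bytes(digest[:8], "big") / 2**64)
    return vec
```

Because the same text always yields the same vector, documents and queries embedded this way are directly comparable, which is what makes such an embedder useful for offline tests.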

Auto query embeddings (no query_embedding required)

If the memory contains memvid.embedding.* identity metadata (recommended) or a known vector dimension (fallback), semantic queries can embed the query automatically:

mv.find("alpha", mode="sem")
mv.find("alpha", mode="sem", query_embedding_model="bge-small")  # override/force
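
Conceptually, semantic mode ranks stored vectors by similarity to the query embedding. A minimal cosine-similarity ranking sketch in plain Python (illustrative only; the SDK's vector index is implemented in Rust, and `rank_by_similarity` is a hypothetical helper, not part of the API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rank_by_similarity(query_vec, docs, k=5):
    """docs: list of (doc_id, embedding) pairs. Returns top-k (doc_id, score)."""
    scored = [(doc_id, cosine(query_vec, emb)) for doc_id, emb in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```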

Core API

Opening a memory

mv = use(kind, path, apikey=None, **options)

| Parameter | Type | Description |
| --- | --- | --- |
| `kind` | `str` | Adapter type: `"basic"`, `"langchain"`, `"llamaindex"`, `"openai"`, `"crewai"`, `"vercel-ai"`, `"autogen"` |
| `path` | `str` | Path to the `.mv2` file |
| `mode` | `str` | `"open"` (default), `"create"`, or `"auto"` |
| `enable_lex` | `bool` | Enable lexical index (default: `True`) |
| `enable_vec` | `bool` | Enable vector index (default: `False`) |
| `read_only` | `bool` | Open in read-only mode (default: `False`) |
| `lock_timeout_ms` | `int` | Lock acquisition timeout in milliseconds (default: `250`) |
| `force` | `str \| None` | Set to `"stale_only"` to force-release stale locks |
| `force_writable` | `bool` | Open read-only first, then re-open writable (best-effort) |

Context manager support:

with use("basic", "notes.mv2") as mv:
    mv.put(title="Note", label="general", metadata={}, text="Content")
# File handle automatically closed

Storing documents

frame_id = mv.put(
    title="Document title",
    label="category",
    metadata={"key": "value"},
    text="Document content...",
    uri="mv2://custom/path",
    tags=["tag1", "tag2"],
)

Batch ingestion

For bulk imports, put_many processes documents in parallel with 100x+ speedup over individual calls:

docs = [
    {"title": "Doc 1", "label": "news", "text": "First document..."},
    {"title": "Doc 2", "label": "news", "text": "Second document..."},
    # ... thousands more
]

frame_ids = mv.put_many(docs, opts={"compression_level": 3})
print(f"Ingested {len(frame_ids)} documents")

Searching

results = mv.find(query, k=5, snippet_chars=240, scope=None, mode=None)

| Parameter | Type | Description |
| --- | --- | --- |
| `query` | `str` | Search query |
| `k` | `int` | Number of results (default: 5) |
| `snippet_chars` | `int` | Snippet length (default: 240) |
| `scope` | `str` | Filter by URI prefix |
| `mode` | `str` | `"auto"`, `"lex"`, or `"sem"` |

Semantic/hybrid options (when mode != "lex"):

| Parameter | Type | Description |
| --- | --- | --- |
| `query_embedding` | `list[float] \| None` | Precomputed query embedding |
| `query_embedding_model` | `str \| None` | Force the embedding model for auto query embeddings |
| `adaptive` | `bool \| None` | Enable the adaptive retrieval cutoff |
| `min_relevancy` | `float \| None` | Minimum relevancy (default: 0.5 when adaptive) |
| `max_k` | `int \| None` | Maximum results (default: 100 when adaptive) |
| `adaptive_strategy` | `str \| None` | One of: `relative`, `absolute`, `cliff`, `elbow`, `combined` |
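
Lexical mode ranks results with BM25. As background, the standard Okapi BM25 score can be sketched in a few lines of Python (illustrative only, not the SDK's Rust implementation; `bm25_scores` is a hypothetical helper):

```python
import math

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """docs: list of token lists. Returns one BM25 score per document."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    scores = []
    for doc in docs:
        score = 0.0
        for term in query_terms:
            n_t = sum(1 for d in docs if term in d)  # docs containing the term
            idf = math.log((N - n_t + 0.5) / (n_t + 0.5) + 1)
            tf = doc.count(term)
            # Term frequency saturates via k1; b normalizes for document length.
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    return scores
```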

Retrieval-augmented generation

response = mv.ask(
    question,
    k=6,
    mode="auto",
    model="openai:gpt-4o-mini",
    api_key=os.environ.get("OPENAI_API_KEY"),
    context_only=False,
    mask_pii=False,
)

| Parameter | Type | Description |
| --- | --- | --- |
| `question` | `str` | Question to answer |
| `k` | `int` | Documents to retrieve (default: 6) |
| `mode` | `str` | `"auto"`, `"lex"`, or `"sem"` |
| `model` | `str` | LLM for synthesis (e.g., `"openai:gpt-4o-mini"`, `"nvidia:meta/llama3-8b-instruct"`) |
| `api_key` | `str` | API key for the LLM provider |
| `context_only` | `bool` | Skip synthesis and return context only |
| `mask_pii` | `bool` | Redact PII from the response |

Semantic/hybrid options (when mode != "lex"):

| Parameter | Type | Description |
| --- | --- | --- |
| `query_embedding` | `list[float] \| None` | Precomputed query embedding |
| `query_embedding_model` | `str \| None` | Force the embedding model for auto query embeddings |
| `adaptive` | `bool \| None` | Enable the adaptive retrieval cutoff |
| `min_relevancy` | `float \| None` | Minimum relevancy (default: 0.5 when adaptive) |
| `max_k` | `int \| None` | Maximum results (default: 100 when adaptive) |
| `adaptive_strategy` | `str \| None` | One of: `relative`, `absolute`, `cliff`, `elbow`, `combined` |

Best practices: Adaptive retrieval

For best search quality, enable adaptive retrieval with the combined strategy. This dynamically adjusts result counts based on relevance scores rather than returning a fixed k:

# Recommended for find()
results = mv.find(
    "your query",
    mode="sem",  # or "auto"
    adaptive=True,
    adaptive_strategy="combined",
)

# Recommended for ask()
answer = mv.ask(
    "your question",
    mode="auto",
    adaptive=True,
    adaptive_strategy="combined",
)

The combined strategy uses both relative thresholds and score cliff detection to filter out low-relevance results, providing higher quality context for RAG applications.
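
The intuition behind such a cutoff can be sketched in plain Python: keep hits whose score stays within a fraction of the best score, and stop at the first sharp drop ("cliff") between consecutive scores. This is a hypothetical illustration of the idea, not the SDK's actual algorithm or thresholds, and `adaptive_cutoff` is not part of the API:

```python
def adaptive_cutoff(scores, rel_frac=0.5, cliff_ratio=0.5):
    """scores: relevance scores in descending order. Returns how many hits to keep."""
    if not scores:
        return 0
    keep = 1
    for prev, cur in zip(scores, scores[1:]):
        if cur < scores[0] * rel_frac:
            break  # relative threshold: too far below the best hit
        if prev > 0 and cur / prev < cliff_ratio:
            break  # cliff: sharp drop between neighboring scores
        keep += 1
    return keep
```

Either condition alone can misfire (a gentle decay fools the cliff check; a uniformly strong tail fools the relative check), which is why combining them tends to give a cleaner cut.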

Timeline queries

entries = mv.timeline(limit=100, since=1704067200, reverse=True)

Statistics

stats = mv.stats()
# {'frame_count': 42, 'size_bytes': 1048576, 'has_lex_index': True, ...}

File operations

# Commit pending changes
mv.seal()

# Explicit commit without sealing
mv.commit()

# Verify file integrity
report = mv.verify(deep=True)
report = Memvid.verify("notes.mv2", deep=True)  # also supported

# Repair indexes
result = mv.doctor(dry_run=True)
result = Memvid.doctor("notes.mv2", rebuild_time_index=True)

Diagnostics

from memvid_sdk import info, lock_who, lock_nudge, verify_single_file

print(info())                       # versions + enabled features
print(lock_who("notes.mv2"))        # lock status + owner (if locked)
print(lock_nudge("notes.mv2"))      # request stale lock release
verify_single_file("notes.mv2")     # ensure no sidecar files exist

Framework adapters

Adapters expose framework-native tools when the corresponding dependency is installed.

LangChain

mv = use("langchain", "notes.mv2")
tools = mv.tools  # List of StructuredTool instances

LlamaIndex

mv = use("llamaindex", "notes.mv2")
engine = mv.as_query_engine()
response = engine.query("What is the timeline?")

OpenAI function calling

mv = use("openai", "notes.mv2")
functions = mv.functions  # JSON schemas for tool_calls

CrewAI

mv = use("crewai", "notes.mv2")
tools = mv.tools  # CrewAI-compatible tools

Table extraction

Extract structured tables from PDFs:

result = mv.put_pdf_tables("report.pdf", embed_rows=True)
print(f"Extracted {result['tables_count']} tables")

tables = mv.list_tables()
data = mv.get_table(tables[0]["table_id"], format="dict")

Error handling

Typed exceptions with stable codes:

| Code | Exception | Description |
| --- | --- | --- |
| MV001 | `CapacityExceededError` | Storage capacity exceeded |
| MV002 | `TicketInvalidError` | Invalid ticket signature |
| MV003 | `TicketReplayError` | Ticket replay detected |
| MV004 | `LexIndexDisabledError` | Lexical index not enabled |
| MV005 | `TimeIndexMissingError` | Time index missing |
| MV006 | `VerifyFailedError` | Verification failed |
| MV007 | `LockedError` | File locked by another process |
| MV008 | `ApiKeyRequiredError` | API key required |
| MV009 | `MemoryAlreadyBoundError` | File bound to another memory |
| MV010 | `FrameNotFoundError` | Frame not found |
| MV011 | `VecIndexDisabledError` | Vector index not enabled |
| MV012 | `CorruptFileError` | Corrupt file / invalid TOC |
| MV013 | `FileNotFoundError` | File not found |
| MV014 | `VecDimensionMismatchError` | Vector dimension mismatch |
| MV015 | `EmbeddingFailedError` | Embedding failed |
| MV016 | `ClipIndexDisabledError` | CLIP index not enabled |
| MV017 | `NerModelNotAvailableError` | NER model not available |

from memvid_sdk import CapacityExceededError

try:
    mv.put(...)
except CapacityExceededError:
    print("Out of storage capacity")

Environment variables

| Variable | Description |
| --- | --- |
| `MEMVID_API_KEY` | API key for capacity beyond the free tier |
| `MEMVID_API_URL` | Control plane URL (for enterprise deployments) |
| `MEMVID_OFFLINE` | Set to `1` to disable network providers and model downloads |
| `MEMVID_MODELS_DIR` | Path to the local embedding model cache (fastembed) |
| `MEMVID_CLIP_MODEL` | Local CLIP model name (default: `mobileclip-s2`) |
| `OPENAI_API_KEY` | API key for OpenAI embeddings (also usable in `ask()` examples) |
| `OPENAI_BASE_URL` | Optional OpenAI-compatible base URL for embeddings |
| `NVIDIA_API_KEY` | API key for NVIDIA Integrate embeddings |
| `NVIDIA_BASE_URL` | Optional NVIDIA Integrate base URL (default: `https://integrate.api.nvidia.com`) |
| `NVIDIA_EMBEDDING_MODEL` | Optional NVIDIA embedding model override |
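
For example, a fully offline setup might look like the following (the cache path is an assumption; use wherever your models were pre-downloaded):

```shell
# Disable network providers and model downloads
export MEMVID_OFFLINE=1

# Point fastembed at a pre-populated local model cache
export MEMVID_MODELS_DIR="$HOME/.cache/memvid-models"

# python app.py   # run your script in this environment
```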

Development

Build from source:

cd proprietary/memvid-bindings/python

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dev dependencies
pip install maturin

# Build native extension
maturin develop --release --skip-install

# Run tests
./.venv/bin/python -m pytest -q

# Smoke-run offline-safe examples
./.venv/bin/python scripts/test_examples.py

# Optional: run gated embedding runtime tests
MEMVID_TEST_FASTEMBED=1 ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py
MEMVID_TEST_OPENAI=1 OPENAI_API_KEY=... ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py
MEMVID_TEST_NVIDIA=1 NVIDIA_API_KEY=... ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py

# Optional: run gated local model tests (requires `memvid models install ...`)
MEMVID_TEST_CLIP=1 ./.venv/bin/python -m pytest -q tests/test_phase2_models_tables.py
MEMVID_TEST_NER=1 ./.venv/bin/python -m pytest -q tests/test_phase2_models_tables.py

Requirements

  • Python 3.8+
  • macOS (Apple Silicon or Intel), Linux (x64), or Windows (x64)

License

Apache-2.0. See LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

memvid_sdk-2.0.129-cp38-abi3-win_amd64.whl (13.8 MB)

Uploaded: CPython 3.8+, Windows x86-64

memvid_sdk-2.0.129-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (20.3 MB)

Uploaded: CPython 3.8+, manylinux: glibc 2.17+ x86-64

memvid_sdk-2.0.129-cp38-abi3-macosx_11_0_x86_64.whl (15.8 MB)

Uploaded: CPython 3.8+, macOS 11.0+ x86-64

memvid_sdk-2.0.129-cp38-abi3-macosx_11_0_arm64.whl (14.1 MB)

Uploaded: CPython 3.8+, macOS 11.0+ ARM64

File details

Details for the file memvid_sdk-2.0.129-cp38-abi3-win_amd64.whl.

File metadata

  • Download URL: memvid_sdk-2.0.129-cp38-abi3-win_amd64.whl
  • Upload date:
  • Size: 13.8 MB
  • Tags: CPython 3.8+, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for memvid_sdk-2.0.129-cp38-abi3-win_amd64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `a4361b2cc72c20e31f7714dd370fe412cdad00f319a7eea66a88e7f624078152` |
| MD5 | `aa5239d94d8dd3e81dfe0b3353ebfebd` |
| BLAKE2b-256 | `1fe2fae3e31e3746eea69c5762434c324fc738961548c1876067dcbcf8cdd3d3` |

See more details on using hashes here.

File details

Details for the file memvid_sdk-2.0.129-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for memvid_sdk-2.0.129-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `95ac3b33286def66ee7a0688d3d63d13eea505776a131e5e7310f7068f8ad168` |
| MD5 | `4ba05bec3c6abc05a897c62822ba8295` |
| BLAKE2b-256 | `af03a774804e8b1bd5de9647e2a7a7f0b5c67539e9bc7a0c35752c25f1a6efde` |


File details

Details for the file memvid_sdk-2.0.129-cp38-abi3-macosx_11_0_x86_64.whl.

File hashes

Hashes for memvid_sdk-2.0.129-cp38-abi3-macosx_11_0_x86_64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `33ff5e777fc18c5b81b0fdfa7fb2eeb4ff720307f12906a3fc81407ae6915b2e` |
| MD5 | `fd79bcac2f711451fbc1d9ef28506bc1` |
| BLAKE2b-256 | `9cc08a1070a5cac78a0e8eba6747189e39fe22da9adec45ded77fd7c3f956006` |


File details

Details for the file memvid_sdk-2.0.129-cp38-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for memvid_sdk-2.0.129-cp38-abi3-macosx_11_0_arm64.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `a8049419cef7affa895ed4a9cf273f380614b8427e60484c6e2e7ed7a42a265a` |
| MD5 | `981adce85c99e077ea5f499c3ecb04b1` |
| BLAKE2b-256 | `9f170e1b99ad36e9bfdc9ff131f88b711260880ec8ba1edc1a714e1288163cd1` |

