
memvid-sdk

Single-file AI memory system for Python. Store documents, search with BM25 ranking, and run retrieval-augmented generation (RAG) queries from a portable .mv2 file.

Built in Rust with PyO3 bindings. No database setup, no network dependencies, no configuration files.

Install

pip install memvid-sdk

Optional extras for framework integrations:

pip install "memvid-sdk[langchain]"     # LangChain tools
pip install "memvid-sdk[llamaindex]"    # LlamaIndex query engine
pip install "memvid-sdk[openai]"        # OpenAI function schemas
pip install "memvid-sdk[crewai]"        # CrewAI tools
pip install "memvid-sdk[full]"          # All integrations

Quick start

from memvid_sdk import use, create

# Open a memory file, creating it if it doesn't exist (mode="auto")
mv = use("basic", "notes.mv2", mode="auto")

# Store a document
mv.put(
    title="Project kickoff",
    label="meeting",
    metadata={"date": "2024-01-15"},
    text="Discussed timeline, assigned tasks to team members.",
)

# Search by keyword
results = mv.find("timeline")
print(results["hits"])

# Ask a question (retrieves relevant context)
answer = mv.ask("What was discussed in the kickoff?")
print(answer["context"])

# Commit changes
mv.seal()

Embeddings & semantic search

Semantic search (mode="sem") and hybrid search (mode="auto") require a vector index:

from memvid_sdk import create

mv = create("notes.mv2", enable_vec=True)

Generate embeddings during ingestion (local or OpenAI)

mv.put(
    title="Doc",
    label="note",
    metadata={},
    text="alpha fastembed test",
    enable_embedding=True,
    embedding_model="bge-small",  # local (fastembed)
)

Batch mode:

mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    opts={"enable_embedding": True, "embedding_model": "bge-small"},
)

Supported embedding_model values: bge-small, bge-base, nomic, gte-large, openai-small, openai-large, openai-ada.

Bring your own embeddings (precomputed)

Store embedding identity metadata so semantic queries can auto-detect the right model later:

mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    embeddings=[[0.1, 0.2, 0.3, 0.4]],
    embedding_identity={"provider": "custom", "model": "my-embedder-v1"},
)

Use an embedder (SDK-managed query embeddings)

from memvid_sdk.embeddings import HashEmbeddings

embedder = HashEmbeddings(dimension=32)  # deterministic offline embedder for tests
mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    embedder=embedder,
)

mv.find("alpha", mode="sem", embedder=embedder)
mv.ask("alpha", mode="sem", context_only=True, embedder=embedder)
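For intuition, a deterministic hash-based embedder in the spirit of HashEmbeddings can be sketched in plain Python. This is an illustration only, not the SDK's implementation; hash_embed is a hypothetical helper:

```python
import hashlib
import math

def hash_embed(text: str, dimension: int = 32) -> list:
    # Bucket each token into one of `dimension` slots via a stable hash,
    # with a hash-derived sign, then L2-normalize the result.
    vec = [0.0] * dimension
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dimension
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]
```

Because the hash is stable, the same text always produces the same vector, which is what makes this style of embedder useful for offline tests.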

Built-in embedders: OpenAIEmbeddings (OPENAI_API_KEY), CohereEmbeddings (COHERE_API_KEY), VoyageEmbeddings (VOYAGE_API_KEY), NvidiaEmbeddings (NVIDIA_API_KEY), HuggingFaceEmbeddings (local), HashEmbeddings (offline deterministic).

from memvid_sdk.embeddings import NvidiaEmbeddings

embedder = NvidiaEmbeddings(model="nvidia/nv-embed-v1")  # uses NVIDIA_API_KEY
mv.put_many([{"title": "Doc", "label": "note", "text": "alpha"}], embedder=embedder)
mv.find("alpha", mode="sem", embedder=embedder)

Auto query embeddings (no query_embedding required)

If the memory contains memvid.embedding.* identity metadata (recommended) or a known vector dimension (fallback), semantic queries can embed the query automatically:

mv.find("alpha", mode="sem")
mv.find("alpha", mode="sem", query_embedding_model="bge-small")  # override/force

Core API

Opening a memory

mv = use(kind, path, apikey=None, **options)
Parameter Type Description
kind str Adapter type: "basic", "langchain", "llamaindex", "openai", "crewai", "vercel-ai", "autogen"
path str Path to .mv2 file
mode str "open" (default), "create", or "auto"
enable_lex bool Enable lexical index (default: True)
enable_vec bool Enable vector index (default: False)
read_only bool Open in read-only mode (default: False)
lock_timeout_ms int Lock acquisition timeout in milliseconds (default: 250)
force str | None Set to "stale_only" to force-release stale locks
force_writable bool Open read-only first, then re-open writable (best-effort)

Context manager support:

with use("basic", "notes.mv2") as mv:
    mv.put(title="Note", label="general", metadata={}, text="Content")
# File handle automatically closed

Storing documents

frame_id = mv.put(
    title="Document title",
    label="category",
    metadata={"key": "value"},
    text="Document content...",
    uri="mv2://custom/path",
    tags=["tag1", "tag2"],
)

Batch ingestion

For bulk imports, put_many processes documents in parallel, giving a 100x+ speedup over individual put calls:

docs = [
    {"title": "Doc 1", "label": "news", "text": "First document..."},
    {"title": "Doc 2", "label": "news", "text": "Second document..."},
    # ... thousands more
]

frame_ids = mv.put_many(docs, opts={"compression_level": 3})
print(f"Ingested {len(frame_ids)} documents")
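For very large imports it can help to feed put_many in fixed-size chunks so each batch's memory stays bounded. A tiny helper, independent of the SDK (batched is hypothetical, not part of memvid-sdk):

```python
def batched(items, size):
    # Yield successive fixed-size slices of a list.
    for start in range(0, len(items), size):
        yield items[start:start + size]

# e.g. for chunk in batched(docs, 10_000): mv.put_many(chunk)
```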

Searching

results = mv.find(query, k=5, snippet_chars=240, scope=None, mode=None)
Parameter Type Description
query str Search query
k int Number of results (default: 5)
snippet_chars int Snippet length (default: 240)
scope str Filter by URI prefix
mode str "auto", "lex", or "sem"

Semantic/hybrid options (when mode != "lex"):

Parameter Type Description
query_embedding list[float] | None Precomputed query embedding
query_embedding_model str | None Force embedding model for auto query embeddings
adaptive bool | None Enable adaptive retrieval cutoff
min_relevancy float | None Minimum relevancy (default: 0.5 when adaptive)
max_k int | None Maximum results (default: 100 when adaptive)
adaptive_strategy str | None One of: relative, absolute, cliff, elbow, combined

Retrieval-augmented generation

response = mv.ask(
    question,
    k=6,
    mode="auto",
    model="openai:gpt-4o-mini",
    api_key=os.environ.get("OPENAI_API_KEY"),
    context_only=False,
    mask_pii=False,
)
Parameter Type Description
question str Question to answer
k int Documents to retrieve (default: 6)
mode str "auto", "lex", or "sem"
model str LLM for synthesis (e.g., "openai:gpt-4o-mini", "nvidia:meta/llama3-8b-instruct")
api_key str API key for the LLM provider
context_only bool Skip synthesis, return context only
mask_pii bool Redact PII from response

Semantic/hybrid options (when mode != "lex"):

Parameter Type Description
query_embedding list[float] | None Precomputed query embedding
query_embedding_model str | None Force embedding model for auto query embeddings
adaptive bool | None Enable adaptive retrieval cutoff
min_relevancy float | None Minimum relevancy (default: 0.5 when adaptive)
max_k int | None Maximum results (default: 100 when adaptive)
adaptive_strategy str | None One of: relative, absolute, cliff, elbow, combined

Best practices: Adaptive retrieval

For best search quality, enable adaptive retrieval with the combined strategy. This dynamically adjusts result counts based on relevance scores rather than returning a fixed k:

# Recommended for find()
results = mv.find(
    "your query",
    mode="sem",  # or "auto"
    adaptive=True,
    adaptive_strategy="combined",
)

# Recommended for ask()
answer = mv.ask(
    "your question",
    mode="auto",
    adaptive=True,
    adaptive_strategy="combined",
)

The combined strategy uses both relative thresholds and score cliff detection to filter out low-relevance results, providing higher quality context for RAG applications.
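For intuition, a cliff-style cutoff can be sketched in plain Python. This is an illustration only, not the SDK's actual algorithm; drop_ratio is a hypothetical parameter, while the defaults mirror the documented min_relevancy=0.5 and max_k=100:

```python
def cliff_cutoff(scores, drop_ratio=0.5, min_relevancy=0.5, max_k=100):
    # scores: relevance scores sorted descending.
    # Returns how many leading results to keep.
    keep = 0
    for i, score in enumerate(scores[:max_k]):
        if score < min_relevancy:
            break  # below the absolute relevancy floor
        if i > 0 and score < scores[i - 1] * drop_ratio:
            break  # score "cliff": much weaker than the previous result
        keep = i + 1
    return keep
```

A combined strategy layers checks like these: a result must clear the absolute floor and not sit below a sharp drop, so a query with three strong hits returns three results instead of a fixed k.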

Timeline queries

entries = mv.timeline(limit=100, since=1704067200, reverse=True)
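The since parameter takes Unix epoch seconds; the 1704067200 above is 2024-01-01 00:00 UTC, which the standard library computes directly:

```python
from datetime import datetime, timezone

# Epoch seconds for 2024-01-01 00:00 UTC
since = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp())
print(since)  # 1704067200
```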

Statistics

stats = mv.stats()
# {'frame_count': 42, 'size_bytes': 1048576, 'has_lex_index': True, ...}

File operations

# Commit pending changes
mv.seal()

# Explicit commit without sealing
mv.commit()

# Verify file integrity
report = mv.verify(deep=True)
report = Memvid.verify("notes.mv2", deep=True)  # also supported

# Repair indexes
result = mv.doctor(dry_run=True)
result = Memvid.doctor("notes.mv2", rebuild_time_index=True)

Diagnostics

from memvid_sdk import info, lock_who, lock_nudge, verify_single_file

print(info())                       # versions + enabled features
print(lock_who("notes.mv2"))        # lock status + owner (if locked)
print(lock_nudge("notes.mv2"))      # request stale lock release
verify_single_file("notes.mv2")     # ensure no sidecar files exist

Framework adapters

Adapters expose framework-native tools when the corresponding dependency is installed.

LangChain

mv = use("langchain", "notes.mv2")
tools = mv.tools  # List of StructuredTool instances

LlamaIndex

mv = use("llamaindex", "notes.mv2")
engine = mv.as_query_engine()
response = engine.query("What is the timeline?")

OpenAI function calling

mv = use("openai", "notes.mv2")
functions = mv.functions  # JSON schemas for tool_calls

CrewAI

mv = use("crewai", "notes.mv2")
tools = mv.tools  # CrewAI-compatible tools

Table extraction

Extract structured tables from PDFs:

result = mv.put_pdf_tables("report.pdf", embed_rows=True)
print(f"Extracted {result['tables_count']} tables")

tables = mv.list_tables()
data = mv.get_table(tables[0]["table_id"], format="dict")

Error handling

Typed exceptions with stable codes:

Code Exception Description
MV001 CapacityExceededError Storage capacity exceeded
MV002 TicketInvalidError Invalid ticket signature
MV003 TicketReplayError Ticket replay detected
MV004 LexIndexDisabledError Lexical index not enabled
MV005 TimeIndexMissingError Time index missing
MV006 VerifyFailedError Verification failed
MV007 LockedError File locked by another process
MV008 ApiKeyRequiredError API key required
MV009 MemoryAlreadyBoundError File bound to another memory
MV010 FrameNotFoundError Frame not found
MV011 VecIndexDisabledError Vector index not enabled
MV012 CorruptFileError Corrupt file / invalid TOC
MV013 FileNotFoundError File not found
MV014 VecDimensionMismatchError Vector dimension mismatch
MV015 EmbeddingFailedError Embedding failed
MV016 ClipIndexDisabledError CLIP index not enabled
MV017 NerModelNotAvailableError NER model not available
from memvid_sdk import CapacityExceededError

try:
    mv.put(...)
except CapacityExceededError:
    print("Out of storage capacity")

Environment variables

Variable Description
MEMVID_API_KEY API key for capacity beyond free tier
MEMVID_API_URL Control plane URL (for enterprise deployments)
MEMVID_OFFLINE Set to 1 to disable network providers and model downloads
MEMVID_MODELS_DIR Path to local embedding model cache (fastembed)
MEMVID_CLIP_MODEL Local CLIP model name (default: mobileclip-s2)
OPENAI_API_KEY API key for OpenAI embeddings and for ask() synthesis
OPENAI_BASE_URL Optional OpenAI-compatible base URL for embeddings
NVIDIA_API_KEY API key for NVIDIA Integrate embeddings
NVIDIA_BASE_URL Optional NVIDIA Integrate base URL (default: https://integrate.api.nvidia.com)
NVIDIA_EMBEDDING_MODEL Optional NVIDIA embedding model override
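Application code can honor these variables before choosing an embedder, e.g. falling back to a local model when offline. A minimal sketch (the memvid_offline helper is hypothetical; only the variable name comes from the table above):

```python
import os

def memvid_offline() -> bool:
    # MEMVID_OFFLINE=1 disables network providers and model downloads.
    return os.environ.get("MEMVID_OFFLINE", "0") == "1"
```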

Development

Build from source:

cd proprietary/memvid-bindings/python

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dev dependencies
pip install maturin

# Build native extension
maturin develop --release --skip-install

# Run tests
./.venv/bin/python -m pytest -q

# Smoke-run offline-safe examples
./.venv/bin/python scripts/test_examples.py

# Optional: run gated embedding runtime tests
MEMVID_TEST_FASTEMBED=1 ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py
MEMVID_TEST_OPENAI=1 OPENAI_API_KEY=... ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py
MEMVID_TEST_NVIDIA=1 NVIDIA_API_KEY=... ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py

# Optional: run gated local model tests (requires `memvid models install ...`)
MEMVID_TEST_CLIP=1 ./.venv/bin/python -m pytest -q tests/test_phase2_models_tables.py
MEMVID_TEST_NER=1 ./.venv/bin/python -m pytest -q tests/test_phase2_models_tables.py

Requirements

  • Python 3.8+
  • macOS (Apple Silicon or Intel), Linux (x64), or Windows (x64)

License

Apache-2.0. See LICENSE.
