
memvid-sdk

Single-file AI memory system for Python. Store documents, search with BM25 ranking, and run retrieval-augmented generation (RAG) queries from a portable .mv2 file.

Built on a Rust core with PyO3 bindings. No database setup, no network dependencies, no configuration files.

Install

pip install memvid-sdk

Optional extras for framework integrations:

pip install "memvid-sdk[langchain]"     # LangChain tools
pip install "memvid-sdk[llamaindex]"    # LlamaIndex query engine
pip install "memvid-sdk[openai]"        # OpenAI function schemas
pip install "memvid-sdk[crewai]"        # CrewAI tools
pip install "memvid-sdk[full]"          # All integrations

Quick start

from memvid_sdk import use, create

# Create or open a memory file (mode="auto")
mv = use("basic", "notes.mv2", mode="auto")

# Store a document
mv.put(
    title="Project kickoff",
    label="meeting",
    metadata={"date": "2024-01-15"},
    text="Discussed timeline, assigned tasks to team members.",
)

# Search by keyword
results = mv.find("timeline")
print(results["hits"])

# Ask a question (retrieves relevant context)
answer = mv.ask("What was discussed in the kickoff?")
print(answer["context"])

# Commit changes
mv.seal()

Embeddings & semantic search

Semantic search (mode="sem") and hybrid search (mode="auto") require a vector index:

from memvid_sdk import create

mv = create("notes.mv2", enable_vec=True)

Generate embeddings during ingestion (local or OpenAI)

mv.put(
    title="Doc",
    label="note",
    metadata={},
    text="alpha fastembed test",
    enable_embedding=True,
    embedding_model="bge-small",  # local (fastembed)
)

Batch mode:

mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    opts={"enable_embedding": True, "embedding_model": "bge-small"},
)

Supported embedding_model values: bge-small, bge-base, nomic, gte-large, openai-small, openai-large, openai-ada.
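
The openai-* models call the OpenAI embeddings API and require OPENAI_API_KEY (see Environment variables below). A sketch mirroring the local example above:

mv.put(
    title="Doc",
    label="note",
    metadata={},
    text="alpha openai test",
    enable_embedding=True,
    embedding_model="openai-small",  # remote (OpenAI)
)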

Bring your own embeddings (precomputed)

Store embedding identity metadata so semantic queries can auto-detect the right model later:

mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    embeddings=[[0.1, 0.2, 0.3, 0.4]],
    embedding_identity={"provider": "custom", "model": "my-embedder-v1"},
)

Use an embedder (SDK-managed query embeddings)

from memvid_sdk.embeddings import HashEmbeddings

embedder = HashEmbeddings(dimension=32)  # deterministic offline embedder for tests
mv.put_many(
    [{"title": "Doc", "label": "note", "text": "alpha"}],
    embedder=embedder,
)

mv.find("alpha", mode="sem", embedder=embedder)
mv.ask("alpha", mode="sem", context_only=True, embedder=embedder)

Built-in embedders: OpenAIEmbeddings (OPENAI_API_KEY), CohereEmbeddings (COHERE_API_KEY), VoyageEmbeddings (VOYAGE_API_KEY), NvidiaEmbeddings (NVIDIA_API_KEY), HuggingFaceEmbeddings (local), HashEmbeddings (offline deterministic).

from memvid_sdk.embeddings import NvidiaEmbeddings

embedder = NvidiaEmbeddings(model="nvidia/nv-embed-v1")  # uses NVIDIA_API_KEY
mv.put_many([{"title": "Doc", "label": "note", "text": "alpha"}], embedder=embedder)
mv.find("alpha", mode="sem", embedder=embedder)

Auto query embeddings (no query_embedding required)

If the memory contains memvid.embedding.* identity metadata (recommended) or a known vector dimension (fallback), semantic queries can embed the query automatically:

mv.find("alpha", mode="sem")
mv.find("alpha", mode="sem", query_embedding_model="bge-small")  # override/force

Core API

Opening a memory

mv = use(kind, path, apikey=None, **options)
Parameter Type Description
kind str Adapter type: "basic", "langchain", "llamaindex", "openai", "crewai", "vercel-ai", "autogen"
path str Path to .mv2 file
mode str "open" (default), "create", or "auto"
enable_lex bool Enable lexical index (default: True)
enable_vec bool Enable vector index (default: False)
read_only bool Open in read-only mode (default: False)
lock_timeout_ms int Lock acquisition timeout in milliseconds (default: 250)
force str | None Set to "stale_only" to force-release stale locks
force_writable bool Open read-only first, then re-open writable (best-effort)
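
For example, combining several of these options (all documented in the table above):

mv = use(
    "basic",
    "notes.mv2",
    mode="auto",           # "open", "create", or "auto"
    enable_vec=True,       # required for mode="sem" / "auto" search
    lock_timeout_ms=2000,  # wait up to 2 s for the file lock
)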

Context manager support:

with use("basic", "notes.mv2") as mv:
    mv.put(title="Note", label="general", metadata={}, text="Content")
# File handle automatically closed

Storing documents

frame_id = mv.put(
    title="Document title",
    label="category",
    metadata={"key": "value"},
    text="Document content...",
    uri="mv2://custom/path",
    tags=["tag1", "tag2"],
)

Batch ingestion

For bulk imports, put_many processes documents in parallel, giving a 100x+ speedup over individual put() calls:

docs = [
    {"title": "Doc 1", "label": "news", "text": "First document..."},
    {"title": "Doc 2", "label": "news", "text": "Second document..."},
    # ... thousands more
]

frame_ids = mv.put_many(docs, opts={"compression_level": 3})
print(f"Ingested {len(frame_ids)} documents")

Searching

results = mv.find(query, k=5, snippet_chars=240, scope=None, mode=None)
Parameter Type Description
query str Search query
k int Number of results (default: 5)
snippet_chars int Snippet length (default: 240)
scope str Filter by URI prefix
mode str "auto", "lex", or "sem"

Semantic/hybrid options (when mode != "lex"):

Parameter Type Description
query_embedding list[float] | None Precomputed query embedding
query_embedding_model str | None Force embedding model for auto query embeddings
adaptive bool | None Enable adaptive retrieval cutoff
min_relevancy float | None Minimum relevancy (default: 0.5 when adaptive)
max_k int | None Maximum results (default: 100 when adaptive)
adaptive_strategy str | None One of: relative, absolute, cliff, elbow, combined
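
For example, a scoped lexical search followed by an adaptive semantic search (the mv2://custom/ prefix is illustrative, matching the uri shown under Storing documents above):

hits = mv.find("timeline", k=10, mode="lex", scope="mv2://custom/")
hits = mv.find(
    "timeline",
    mode="sem",
    adaptive=True,
    min_relevancy=0.6,  # drop weak matches
    max_k=20,           # hard ceiling on result count
)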

Retrieval-augmented generation

import os

response = mv.ask(
    question,
    k=6,
    mode="auto",
    model="openai:gpt-4o-mini",
    api_key=os.environ.get("OPENAI_API_KEY"),
    context_only=False,
    mask_pii=False,
)
Parameter Type Description
question str Question to answer
k int Documents to retrieve (default: 6)
mode str "auto", "lex", or "sem"
model str LLM for synthesis (e.g., "openai:gpt-4o-mini", "nvidia:meta/llama3-8b-instruct")
api_key str API key for the LLM provider
context_only bool Skip synthesis, return context only
mask_pii bool Redact PII from response

Semantic/hybrid options (when mode != "lex"):

Parameter Type Description
query_embedding list[float] | None Precomputed query embedding
query_embedding_model str | None Force embedding model for auto query embeddings
adaptive bool | None Enable adaptive retrieval cutoff
min_relevancy float | None Minimum relevancy (default: 0.5 when adaptive)
max_k int | None Maximum results (default: 100 when adaptive)
adaptive_strategy str | None One of: relative, absolute, cliff, elbow, combined
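
With context_only=True the SDK skips synthesis and returns only the retrieved context, so you can drive your own model. A minimal sketch (the prompt template here is illustrative, not the SDK's):

resp = mv.ask("What was discussed in the kickoff?", mode="auto", context_only=True)
prompt = f"Answer using only this context:\n\n{resp['context']}\n\nQ: What was discussed in the kickoff?"
# ...send `prompt` to whichever LLM client you use.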

Best practices: Adaptive retrieval

For best search quality, enable adaptive retrieval with the combined strategy. This dynamically adjusts result counts based on relevance scores rather than returning a fixed k:

# Recommended for find()
results = mv.find(
    "your query",
    mode="sem",  # or "auto"
    adaptive=True,
    adaptive_strategy="combined",
)

# Recommended for ask()
answer = mv.ask(
    "your question",
    mode="auto",
    adaptive=True,
    adaptive_strategy="combined",
)

The combined strategy uses both relative thresholds and score cliff detection to filter out low-relevance results, providing higher quality context for RAG applications.
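
The SDK applies this cutoff internally. As an illustration of the idea only (a sketch, not the SDK's internal algorithm):

def combined_cutoff(scores, rel=0.5, cliff=0.5):
    # Keep hits whose score stays within `rel` of the best score, and stop
    # at the first sharp relative drop ("cliff") between neighbours.
    keep = 0
    for i, s in enumerate(scores):
        if s < scores[0] * rel:                                   # relative threshold
            break
        if i > 0 and (scores[i - 1] - s) > cliff * scores[i - 1]:  # score cliff
            break
        keep = i + 1
    return keep

print(combined_cutoff([0.92, 0.90, 0.88, 0.41, 0.12]))  # -> 3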

Timeline queries

entries = mv.timeline(limit=100, since=1704067200, reverse=True)
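
since appears to be a Unix timestamp in seconds (1704067200 is 2024-01-01 UTC). A sketch using a relative cutoff:

import time

week_ago = int(time.time()) - 7 * 24 * 3600
entries = mv.timeline(limit=50, since=week_ago, reverse=True)  # last 7 days, newest first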

Statistics

stats = mv.stats()
# {'frame_count': 42, 'size_bytes': 1048576, 'has_lex_index': True, ...}

File operations

# Commit pending changes
mv.seal()

# Explicit commit without sealing
mv.commit()

# Verify file integrity
report = mv.verify(deep=True)
report = Memvid.verify("notes.mv2", deep=True)  # static form (from memvid_sdk import Memvid)

# Repair indexes
result = mv.doctor(dry_run=True)
result = Memvid.doctor("notes.mv2", rebuild_time_index=True)

Diagnostics

from memvid_sdk import info, lock_who, lock_nudge, verify_single_file

print(info())                       # versions + enabled features
print(lock_who("notes.mv2"))        # lock status + owner (if locked)
print(lock_nudge("notes.mv2"))      # request stale lock release
verify_single_file("notes.mv2")     # ensure no sidecar files exist

Framework adapters

Adapters expose framework-native tools when the corresponding dependency is installed.

LangChain

mv = use("langchain", "notes.mv2")
tools = mv.tools  # List of StructuredTool instances

LlamaIndex

mv = use("llamaindex", "notes.mv2")
engine = mv.as_query_engine()
response = engine.query("What is the timeline?")

OpenAI function calling

mv = use("openai", "notes.mv2")
functions = mv.functions  # JSON schemas for tool_calls
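
A sketch wiring these schemas into a chat completion, assuming mv.functions is already shaped as the tools array the OpenAI client expects:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Search my notes for the kickoff."}],
    tools=mv.functions,  # assumption: ready-to-use tool definitions
)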

CrewAI

mv = use("crewai", "notes.mv2")
tools = mv.tools  # CrewAI-compatible tools

Table extraction

Extract structured tables from PDFs:

result = mv.put_pdf_tables("report.pdf", embed_rows=True)
print(f"Extracted {result['tables_count']} tables")

tables = mv.list_tables()
data = mv.get_table(tables[0]["table_id"], format="dict")

Error handling

Typed exceptions with stable codes:

Code Exception Description
MV001 CapacityExceededError Storage capacity exceeded
MV002 TicketInvalidError Invalid ticket signature
MV003 TicketReplayError Ticket replay detected
MV004 LexIndexDisabledError Lexical index not enabled
MV005 TimeIndexMissingError Time index missing
MV006 VerifyFailedError Verification failed
MV007 LockedError File locked by another process
MV008 ApiKeyRequiredError API key required
MV009 MemoryAlreadyBoundError File bound to another memory
MV010 FrameNotFoundError Frame not found
MV011 VecIndexDisabledError Vector index not enabled
MV012 CorruptFileError Corrupt file / invalid TOC
MV013 FileNotFoundError File not found
MV014 VecDimensionMismatchError Vector dimension mismatch
MV015 EmbeddingFailedError Embedding failed
MV016 ClipIndexDisabledError CLIP index not enabled
MV017 NerModelNotAvailableError NER model not available

from memvid_sdk import CapacityExceededError

try:
    mv.put(...)
except CapacityExceededError:
    print("Out of storage capacity")

Environment variables

Variable Description
MEMVID_API_KEY API key for capacity beyond free tier
MEMVID_API_URL Control plane URL (for enterprise deployments)
MEMVID_OFFLINE Set to 1 to disable network providers and model downloads
MEMVID_MODELS_DIR Path to local embedding model cache (fastembed)
MEMVID_CLIP_MODEL Local CLIP model name (default: mobileclip-s2)
OPENAI_API_KEY API key for OpenAI embeddings and for ask() synthesis with openai:* models
OPENAI_BASE_URL Optional OpenAI-compatible base URL for embeddings
NVIDIA_API_KEY API key for NVIDIA Integrate embeddings
NVIDIA_BASE_URL Optional NVIDIA Integrate base URL (default: https://integrate.api.nvidia.com)
NVIDIA_EMBEDDING_MODEL Optional NVIDIA embedding model override
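
For fully offline use, a sketch setting the relevant variables before the SDK is imported (setting them first is the safe order; whether they are read at import or at call time is not documented here):

import os

os.environ["MEMVID_OFFLINE"] = "1"            # disable network providers / model downloads
os.environ["MEMVID_MODELS_DIR"] = "./models"  # local fastembed model cache

from memvid_sdk import use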

Development

Build from source:

cd proprietary/memvid-bindings/python

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dev dependencies
pip install maturin

# Build native extension
maturin develop --release --skip-install

# Run tests
./.venv/bin/python -m pytest -q

# Smoke-run offline-safe examples
./.venv/bin/python scripts/test_examples.py

# Optional: run gated embedding runtime tests
MEMVID_TEST_FASTEMBED=1 ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py
MEMVID_TEST_OPENAI=1 OPENAI_API_KEY=... ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py
MEMVID_TEST_NVIDIA=1 NVIDIA_API_KEY=... ./.venv/bin/python -m pytest -q tests/test_runtime_embeddings_gated.py

# Optional: run gated local model tests (requires `memvid models install ...`)
MEMVID_TEST_CLIP=1 ./.venv/bin/python -m pytest -q tests/test_phase2_models_tables.py
MEMVID_TEST_NER=1 ./.venv/bin/python -m pytest -q tests/test_phase2_models_tables.py

Requirements

  • Python 3.8+
  • macOS (Apple Silicon or Intel), Linux (x64), or Windows (x64)

License

Apache-2.0. See LICENSE.
