Single-file AI memory system for Python. Store, search, and query documents with built-in RAG.

These details have not been verified by PyPI

memvid-sdk

A single-file AI memory system for Python. Store documents, search with BM25 + vector ranking, and run RAG queries from a portable .mv2 file.

Built on Rust with PyO3 bindings. No database setup, no external services required.

Install

pip install memvid-sdk

For framework integrations:

pip install "memvid-sdk[langchain]"    # LangChain tools
pip install "memvid-sdk[llamaindex]"   # LlamaIndex query engine
pip install "memvid-sdk[openai]"       # OpenAI function schemas
pip install "memvid-sdk[full]"         # All integrations

Quick Start

from memvid_sdk import create

# Create a memory file
mv = create("notes.mv2")

# Store some documents
mv.put(
    title="Project Update",
    label="meeting",
    text="Discussed Q4 roadmap. Alice will handle the frontend refactor.",
    metadata={"date": "2024-01-15", "attendees": ["Alice", "Bob"]}
)

mv.put(
    title="Technical Decision",
    label="architecture",
    text="Decided to use PostgreSQL for the main database. Redis for caching.",
)

# Search by keyword
results = mv.find("database")
for hit in results["hits"]:
    print(f"{hit['title']}: {hit['snippet']}")

# Ask a question
answer = mv.ask("What database are we using?", model="openai:gpt-4o-mini")
print(answer["text"])

# Close the file
mv.seal()

Core API

Opening and Creating

from memvid_sdk import create, use

# Create a new memory file
mv = create("notes.mv2")

# Open an existing file
mv = use("basic", "notes.mv2", mode="open")

# Create or open (auto mode)
mv = use("basic", "notes.mv2", mode="auto")

# Open read-only
mv = use("basic", "notes.mv2", read_only=True)

# Context manager (auto-closes)
with use("basic", "notes.mv2") as mv:
    mv.put(title="Note", label="general", text="Content here")

Storing Documents

# Store text content
mv.put(
    title="Meeting Notes",
    label="meeting",
    text="Discussed the new API design.",
    metadata={"date": "2024-01-15", "priority": "high"},
    tags=["api", "design", "q1"]
)

# Store a file (PDF, DOCX, TXT, etc.)
mv.put(
    title="Q4 Report",
    label="reports",
    file="./documents/q4-report.pdf"
)

# Store with both text and file
mv.put(
    title="Contract Summary",
    label="legal",
    text="Key terms: 2-year agreement, auto-renewal clause.",
    file="./contracts/agreement.pdf"
)

Batch Ingestion

For large imports, put_many is significantly faster than calling put once per document:

documents = [
    {"title": "Doc 1", "label": "notes", "text": "First document content..."},
    {"title": "Doc 2", "label": "notes", "text": "Second document content..."},
    # ... thousands more
]

frame_ids = mv.put_many(documents)
print(f"Added {len(frame_ids)} documents")
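When the input is too large to hold in one list, it may help to feed put_many in fixed-size batches. The helper below is a minimal sketch of that idea; `chunked` and the batch size of 1000 are illustrative assumptions, not part of the SDK.

```python
from typing import Dict, Iterable, Iterator, List


def chunked(docs: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive lists of at most `size` documents."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[Dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


# Hypothetical usage with an open memory `mv` and your own document source:
# for batch in chunked(load_documents(), 1000):
#     mv.put_many(batch)
```

Because `chunked` accepts any iterable, the documents can come from a generator that reads them lazily from disk.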

Searching

# Lexical search (BM25 ranking)
results = mv.find("machine learning", k=10)

for hit in results["hits"]:
    print(f"{hit['title']}: {hit['snippet']}")

Search parameters:

Parameter	Type	Description
`k`	int	Number of results (default: 5)
`snippet_chars`	int	Snippet length (default: 240)
`mode`	str	Search mode: `"lex"` (lexical), `"sem"` (semantic), or `"auto"`
`scope`	str	Filter by URI prefix
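To give a feel for what BM25 ranking rewards, here is a small pure-Python sketch of the standard Okapi BM25 formula. It is not memvid's Rust implementation; the whitespace tokenizer and the `k1` and `b` values are assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, docs: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: in how many documents each term appears.
    df: Counter = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that contain rare query terms, and contain them in a short text, score highest; documents with no query term score zero.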

Semantic Search

Semantic search requires embeddings. Generate them during ingestion:

# Using local embeddings (bge-small, nomic, etc.)
mv.put(
    title="Document",
    text="Content here...",
    enable_embedding=True,
    embedding_model="bge-small"
)

# Using OpenAI embeddings
mv.put(
    title="Document",
    text="Content here...",
    enable_embedding=True,
    embedding_model="openai-small"  # requires OPENAI_API_KEY
)

Then search semantically:

results = mv.find("neural networks", mode="sem")
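Semantic search compares embedding vectors rather than keywords. This README does not say which similarity measure memvid uses, so the sketch below assumes cosine similarity, a common choice; both function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```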

Windows users: Local embedding models (bge-small, nomic, etc.) are not available on Windows due to ONNX runtime limitations. Use OpenAI embeddings instead by setting OPENAI_API_KEY.
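For code that must run on every supported platform, one option is to pick the embedding model at runtime. This is a minimal sketch; `pick_embedding_model` is a hypothetical helper, and the two model names are the ones used in the examples above.

```python
import sys
from typing import Optional


def pick_embedding_model(platform: Optional[str] = None) -> str:
    """Return a local model where available, an OpenAI model on Windows."""
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        # Local ONNX-based models are unavailable; needs OPENAI_API_KEY.
        return "openai-small"
    return "bge-small"


# Hypothetical usage with an open memory `mv`:
# mv.put(title="Document", text="Content here...",
#        enable_embedding=True, embedding_model=pick_embedding_model())
```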

Question Answering (RAG)

# Basic RAG query
answer = mv.ask("What did we decide about the database?")
print(answer["text"])

# With specific model
answer = mv.ask(
    "Summarize the meeting notes",
    model="openai:gpt-4o-mini",
    k=6  # number of documents to retrieve
)

# Get context only (no LLM synthesis)
context = mv.ask("What was discussed?", context_only=True)
print(context["context"])  # Retrieved document snippets
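If you retrieve context yourself and send it to your own LLM, the retrieved snippets have to be assembled into a prompt. The sketch below shows one way to do that from hits shaped like the find results above (dicts with title and snippet keys). The prompt wording and `build_prompt` are assumptions, not the prompt memvid uses internally.

```python
from typing import Dict, List


def build_prompt(question: str, hits: List[Dict[str, str]]) -> str:
    """Combine retrieved snippets and a question into a single prompt."""
    sources = "\n".join(
        f"[{i}] {hit['title']}: {hit['snippet']}"
        for i, hit in enumerate(hits, start=1)
    )
    return (
        "Answer the question using only the sources below.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical usage with an open memory `mv`:
# hits = mv.find("database", k=3)["hits"]
# prompt = build_prompt("What database are we using?", hits)
```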

Timeline and Stats

# Get recent entries
entries = mv.timeline(limit=20)

# Get statistics
stats = mv.stats()
print(f"Documents: {stats['frame_count']}")
print(f"Size: {stats['size_bytes']} bytes")

Closing

Always close the memory when done:

mv.seal()

Or use a context manager for automatic cleanup.
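For a handle that was opened without a with statement, a small wrapper can guarantee that seal() runs even when an exception is raised. This is a sketch only: `sealing` is a hypothetical helper, and `FakeMemory` is a stand-in so the example runs without the SDK.

```python
from contextlib import contextmanager


@contextmanager
def sealing(memory):
    """Yield `memory` and always call its seal() method afterwards."""
    try:
        yield memory
    finally:
        memory.seal()


class FakeMemory:
    """Stand-in for a memvid handle; only records whether seal() ran."""

    def __init__(self) -> None:
        self.sealed = False

    def seal(self) -> None:
        self.sealed = True


# Hypothetical usage with the real SDK:
# with sealing(create("notes.mv2")) as mv:
#     mv.put(title="Note", label="general", text="Content here")
```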

External Embeddings

For more control over embeddings, use external providers:

from memvid_sdk import create
from memvid_sdk.embeddings import OpenAIEmbeddings

# Create memory with vector index enabled
mv = create("knowledge.mv2", enable_vec=True, enable_lex=True)

# Initialize embedding provider
embedder = OpenAIEmbeddings(model="text-embedding-3-small")

# Prepare documents
documents = [
    {"title": "ML Basics", "label": "ai", "text": "Machine learning enables systems to learn from data."},
    {"title": "Deep Learning", "label": "ai", "text": "Deep learning uses neural networks with multiple layers."},
]

# Generate embeddings
texts = [doc["text"] for doc in documents]
embeddings = embedder.embed_documents(texts)

# Store documents with pre-computed embeddings
frame_ids = mv.put_many(documents, embeddings=embeddings)

# Search using external embeddings
query = "neural networks"
query_embedding = embedder.embed_query(query)
results = mv.find(query, k=3, query_embedding=query_embedding, mode="sem")

for hit in results["hits"]:
    print(f"{hit['title']}: {hit['score']:.3f}")
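When passing pre-computed embeddings, each document needs exactly one vector and all vectors should share one dimension. The check below is a minimal sketch you could run before calling put_many; `check_embeddings` is a hypothetical helper, and how the SDK itself reacts to mismatched input is not documented here.

```python
from typing import Dict, Sequence


def check_embeddings(
    documents: Sequence[Dict], embeddings: Sequence[Sequence[float]]
) -> int:
    """Validate one vector per document and return the shared dimension."""
    if len(documents) != len(embeddings):
        raise ValueError(
            f"{len(documents)} documents but {len(embeddings)} embeddings"
        )
    dims = {len(vec) for vec in embeddings}
    if len(dims) > 1:
        raise ValueError(f"inconsistent embedding dimensions: {sorted(dims)}")
    return dims.pop() if dims else 0


# Hypothetical usage:
# check_embeddings(documents, embeddings)
# frame_ids = mv.put_many(documents, embeddings=embeddings)
```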

Built-in providers:

OpenAIEmbeddings (requires OPENAI_API_KEY)
CohereEmbeddings (requires COHERE_API_KEY)
VoyageEmbeddings (requires VOYAGE_API_KEY)
NvidiaEmbeddings (requires NVIDIA_API_KEY)
GeminiEmbeddings (requires GOOGLE_API_KEY or GEMINI_API_KEY)
MistralEmbeddings (requires MISTRAL_API_KEY)
HuggingFaceEmbeddings (local, no API key)

Use the factory function for quick setup:

from memvid_sdk.embeddings import get_embedder

# Create any supported provider
embedder = get_embedder("openai")  # or "cohere", "voyage", "nvidia", "gemini", "mistral", "huggingface"
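Since most providers need an API key, it can be useful to check the environment before constructing an embedder. The mapping below restates the provider list above; `PROVIDER_ENV_VARS` and `has_credentials` are hypothetical names, and the provider keys are the strings accepted by get_embedder in the example.

```python
import os
from typing import Dict, List, Mapping, Optional

# Environment variables each provider reads, per the list above.
PROVIDER_ENV_VARS: Dict[str, List[str]] = {
    "openai": ["OPENAI_API_KEY"],
    "cohere": ["COHERE_API_KEY"],
    "voyage": ["VOYAGE_API_KEY"],
    "nvidia": ["NVIDIA_API_KEY"],
    "gemini": ["GOOGLE_API_KEY", "GEMINI_API_KEY"],  # either one suffices
    "mistral": ["MISTRAL_API_KEY"],
    "huggingface": [],  # local, no API key
}


def has_credentials(provider: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the provider needs no key or at least one key is set."""
    env = os.environ if env is None else env
    names = PROVIDER_ENV_VARS[provider]
    return not names or any(env.get(name) for name in names)
```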

Framework Integrations

LangChain

mv = use("langchain", "notes.mv2")
tools = mv.tools  # List of StructuredTool instances

LlamaIndex

mv = use("llamaindex", "notes.mv2")
engine = mv.as_query_engine()
response = engine.query("What is the timeline?")

OpenAI Function Calling

mv = use("openai", "notes.mv2")
functions = mv.functions  # JSON schemas for tool_calls

CrewAI

mv = use("crewai", "notes.mv2")
tools = mv.tools  # CrewAI-compatible tools

Error Handling

Typed exceptions for programmatic handling:

from memvid_sdk import CapacityExceededError, LockedError, EmbeddingFailedError

try:
    mv.put(title="Doc", text="Content")
except CapacityExceededError:
    print("Storage capacity exceeded")
except LockedError:
    print("File is locked by another process")
except EmbeddingFailedError:
    print("Embedding generation failed")

Common exceptions:

Code	Exception	Description
MV001	`CapacityExceededError`	Storage capacity exceeded
MV007	`LockedError`	File locked by another process
MV010	`FrameNotFoundError`	Frame not found
MV013	`FileNotFoundError`	File not found
MV015	`EmbeddingFailedError`	Embedding failed
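A lock held by another process is often temporary, so one reasonable response to LockedError is to retry with a growing delay. The sketch below assumes that retrying is appropriate for your workload. `retry_on` is a hypothetical helper, and the `LockedError` class defined here is a local stand-in so the example runs without the SDK; in real code, import it from memvid_sdk instead.

```python
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


class LockedError(Exception):
    """Local stand-in for memvid_sdk.LockedError."""


def retry_on(
    exc_type: Type[BaseException],
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
) -> T:
    """Call `operation`, retrying on `exc_type` with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except exc_type:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


# Hypothetical usage with an open memory `mv`:
# retry_on(LockedError, lambda: mv.put(title="Doc", text="Content"))
```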

Environment Variables

Variable	Description
`OPENAI_API_KEY`	For OpenAI embeddings and LLM synthesis
`OPENAI_BASE_URL`	Custom OpenAI-compatible endpoint
`NVIDIA_API_KEY`	For NVIDIA NIM embeddings
`MEMVID_MODELS_DIR`	Local embedding model cache directory
`MEMVID_API_KEY`	For capacity beyond the free tier
`MEMVID_OFFLINE`	Set to `1` to disable network features
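A short preflight check can show how these variables are set before a memory is opened. The sketch below only reads the environment; `read_memvid_env` and the keys of the returned dict are illustrative assumptions, not part of the SDK.

```python
import os
from typing import Dict, Mapping, Optional


def read_memvid_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Summarize memvid-related settings without exposing secret values."""
    env = os.environ if env is None else env
    return {
        "offline": env.get("MEMVID_OFFLINE") == "1",
        "models_dir": env.get("MEMVID_MODELS_DIR"),
        "openai_base_url": env.get("OPENAI_BASE_URL"),
        # Report only whether keys are present, never their contents.
        "has_openai_key": bool(env.get("OPENAI_API_KEY")),
        "has_memvid_key": bool(env.get("MEMVID_API_KEY")),
    }
```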

Platform Support

Platform	Architecture	Local Embeddings
macOS	ARM64 (Apple Silicon)	Yes
macOS	x64 (Intel)	Yes
Linux	x64 (glibc)	Yes
Windows	x64	No (use OpenAI)

Requirements

Python 3.8 or later
For local embeddings: macOS or Linux (Windows requires OpenAI)

More Information

Documentation: https://docs.memvid.com
GitHub: https://github.com/memvid/memvid
Discord: https://discord.gg/2mynS7fcK7
Website: https://memvid.com

License

Apache-2.0

Release history

Version	Released
2.0.159	Mar 13, 2026
2.0.158	Mar 3, 2026
2.0.157	Feb 15, 2026
2.0.156	Feb 7, 2026
2.0.153	Jan 27, 2026
2.0.152	Jan 25, 2026
2.0.151	Jan 17, 2026
2.0.149	Jan 16, 2026
2.0.148 (this version)	Jan 10, 2026
2.0.147	Jan 9, 2026
2.0.144	Jan 5, 2026
2.0.142	Jan 3, 2026
2.0.141	Jan 3, 2026
2.0.140	Jan 3, 2026
2.0.132	Jan 2, 2026
2.0.131	Jan 1, 2026
2.0.130	Dec 25, 2025
2.0.129	Dec 25, 2025
2.0.124	Dec 24, 2025
2.0.123	Dec 17, 2025
2.0.112	Dec 17, 2025

Download files

Download the file for your platform.

Source Distribution

File	Size	Uploaded	Tags
memvid_sdk-2.0.148.tar.gz	7.3 MB	Jan 10, 2026	Source

Built Distributions

File	Size	Uploaded	Tags
memvid_sdk-2.0.148-cp38-abi3-win_amd64.whl	13.5 MB	Jan 10, 2026	CPython 3.8+, Windows x86-64
memvid_sdk-2.0.148-cp38-abi3-manylinux_2_35_x86_64.whl	99.4 MB	Jan 10, 2026	CPython 3.8+, manylinux: glibc 2.35+ x86-64
memvid_sdk-2.0.148-cp38-abi3-macosx_11_0_arm64.whl	63.9 MB	Jan 10, 2026	CPython 3.8+, macOS 11.0+ ARM64
memvid_sdk-2.0.148-cp38-abi3-macosx_10_12_x86_64.whl	66.2 MB	Jan 10, 2026	CPython 3.8+, macOS 10.12+ x86-64

File details

Details for the file memvid_sdk-2.0.148.tar.gz.

File metadata

Download URL: memvid_sdk-2.0.148.tar.gz
Upload date: Jan 10, 2026
Size: 7.3 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for memvid_sdk-2.0.148.tar.gz
Algorithm	Hash digest
SHA256	`40aeddbeb0c3b4aee5571da9440e4cdbdf4a03a64b11b9cb82d5dfca8ed1220f`
MD5	`c92ffce7cee0e6de1feeaef0fc224070`
BLAKE2b-256	`928e8b5278e74708ca4787101c59ff1f576831e50766db1fec315a97548fdcd5`
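To confirm that a downloaded file matches its published digest, you can compute the SHA256 locally with Python's standard hashlib module. The function names below are illustrative; the digest to compare against is the SHA256 value listed for the file you downloaded.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Hypothetical usage after downloading the source distribution:
# matches_published_hash(
#     "memvid_sdk-2.0.148.tar.gz",
#     "40aeddbeb0c3b4aee5571da9440e4cdbdf4a03a64b11b9cb82d5dfca8ed1220f",
# )
```

Reading in chunks keeps memory use flat, which matters for the larger wheels.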

File details

Details for the file memvid_sdk-2.0.148-cp38-abi3-win_amd64.whl.

File metadata

Download URL: memvid_sdk-2.0.148-cp38-abi3-win_amd64.whl
Upload date: Jan 10, 2026
Size: 13.5 MB
Tags: CPython 3.8+, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for memvid_sdk-2.0.148-cp38-abi3-win_amd64.whl
Algorithm	Hash digest
SHA256	`93b675ded847e6597e35df9a1923bc8b7fa453efd619c2a9f171efa348357a50`
MD5	`6601daf4af275a346b168d5f85812dc6`
BLAKE2b-256	`056d66dd51bc23f67b36995833758e0e86e5fcb4037fa5ddb63053a9cea6864a`

File details

Details for the file memvid_sdk-2.0.148-cp38-abi3-manylinux_2_35_x86_64.whl.

File metadata

Download URL: memvid_sdk-2.0.148-cp38-abi3-manylinux_2_35_x86_64.whl
Upload date: Jan 10, 2026
Size: 99.4 MB
Tags: CPython 3.8+, manylinux: glibc 2.35+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for memvid_sdk-2.0.148-cp38-abi3-manylinux_2_35_x86_64.whl
Algorithm	Hash digest
SHA256	`b95a62bd2095a2a29a5cd0c7d222dbefa5b5dbc31eb643b7320d2f171c766085`
MD5	`f9e8e4255c61161cf15870dd3f08f40a`
BLAKE2b-256	`b18d032ec2790f054b99e8b0e3b80b29eea27d40113b63c40c0bbe2704155786`

File details

Details for the file memvid_sdk-2.0.148-cp38-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: memvid_sdk-2.0.148-cp38-abi3-macosx_11_0_arm64.whl
Upload date: Jan 10, 2026
Size: 63.9 MB
Tags: CPython 3.8+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for memvid_sdk-2.0.148-cp38-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`aa98de3a1bca23a4f9c2fd32827954adf8944a664c9b703120e8b217622846ca`
MD5	`4b90d94b5b2cf642fa32243103c2c2cf`
BLAKE2b-256	`1e540e1ab690e53173083da27b4b22e4cce9ccd65cd0134cefe2730a08a40d3b`

File details

Details for the file memvid_sdk-2.0.148-cp38-abi3-macosx_10_12_x86_64.whl.

File metadata

Download URL: memvid_sdk-2.0.148-cp38-abi3-macosx_10_12_x86_64.whl
Upload date: Jan 10, 2026
Size: 66.2 MB
Tags: CPython 3.8+, macOS 10.12+ x86-64
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for memvid_sdk-2.0.148-cp38-abi3-macosx_10_12_x86_64.whl
Algorithm	Hash digest
SHA256	`17ff57d449e3070f722423ec2902d12609e63ebc3b75b97dc0fa4e83d3e05577`
MD5	`e1da6fb58a1a32aa0c18acabbac96e9c`
BLAKE2b-256	`eab3b72bb6a884ea4c1f628274a97ed322f17eb485a5ba5dfdcda50767a77180`
