Zero-cloud local vector memory CLI — Ollama embeddings + Qdrant
Project description
local-vector-memory
Zero-cloud, local-first vector memory CLI. Powered by Ollama embeddings + Qdrant.
100% local, 100% free, supports Chinese out of the box.
Why?
Most vector memory solutions require cloud APIs (OpenAI, Pinecone, etc.). This one runs entirely on your machine — perfect for privacy-first setups, air-gapped environments, or just saving money.
Features
- 🔒 100% local: Ollama embeddings, local Qdrant file storage
- 🇨🇳 Chinese-first: defaults to qwen3-embedding:4b (2560d, best Chinese accuracy of the models compared below)
- ⚡ Fast: ~230ms/query on M1 Mac
- 📦 Zero cloud deps: no API keys, no Docker, no signup
- 🔄 Auto reindex: point at your markdown files, rebuild index in seconds
- 🎯 Accurate: 100% Top-3 hit rate in the author's tests on Chinese memory queries
Quick Start
Prerequisites
# Install Ollama (https://ollama.com)
curl -fsSL https://ollama.com/install.sh | sh
# Pull embedding model
ollama pull qwen3-embedding:4b
# Install Python dependencies (qdrant-client and requests)
pip install qdrant-client requests
Install
pip install local-vector-memory
Usage
# Initialize (first time)
lvm init
# Add a memory
lvm add "OpenClaw baseUrl must be http://localhost:11434 without /v1"
# Search
lvm search "how to fix baseUrl"
lvm search "baseUrl配置" --limit 3
# Reindex markdown files
lvm reindex --dir ~/notes --glob "**/*.md"
# List stats
lvm stats
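To script these commands from Python, a thin wrapper around the CLI is one option. The sketch below is illustrative only: the helper names (`lvm_command`, `search_command`, `run`) are hypothetical, it uses only the subcommands and the `--limit` flag shown above, and it returns raw stdout because the output format of `lvm` is not documented here.

```python
import shutil
import subprocess
from typing import List, Optional


def lvm_command(*args: str) -> List[str]:
    """Build the argv list for an lvm invocation, e.g. lvm_command("stats")."""
    return ["lvm", *args]


def search_command(query: str, limit: Optional[int] = None) -> List[str]:
    """Build `lvm search QUERY [--limit N]`, mirroring the usage examples."""
    argv = lvm_command("search", query)
    if limit is not None:
        argv += ["--limit", str(limit)]
    return argv


def run(argv: List[str]) -> str:
    """Run the command and return its stdout unparsed (format is not assumed)."""
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]} is not on PATH; install it first")
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout
```

With the package installed, `run(search_command("baseUrl配置", limit=3))` would execute the same search as the second example above.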
Configuration
Environment variables (or .env file):
| Variable | Default | Description |
|---|---|---|
| LVM_OLLAMA_URL | http://localhost:11434 | Ollama API URL |
| LVM_MODEL | qwen3-embedding:4b | Embedding model |
| LVM_DIMS | 2560 | Vector dimensions (model-dependent) |
| LVM_DB_PATH | ~/.local-vector-memory/qdrant | Qdrant storage path |
| LVM_COLLECTION | memory | Qdrant collection name |
| LVM_CHUNK_SIZE | 400 | Text chunk size (chars) |
| LVM_CHUNK_OVERLAP | 50 | Overlap between chunks |
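As a rough illustration of how these variables and defaults fit together, here is a minimal sketch of a settings loader. It is not the package's actual code: the `Settings` class and `load_settings` function are hypothetical names, `.env` file parsing is left out, and the overlap check is an added assumption rather than documented behavior.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    ollama_url: str
    model: str
    dims: int
    db_path: Path
    collection: str
    chunk_size: int
    chunk_overlap: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the LVM_* variables, falling back to the documented defaults."""
    db_path = env.get("LVM_DB_PATH", "~/.local-vector-memory/qdrant")
    settings = Settings(
        ollama_url=env.get("LVM_OLLAMA_URL", "http://localhost:11434"),
        model=env.get("LVM_MODEL", "qwen3-embedding:4b"),
        dims=int(env.get("LVM_DIMS", "2560")),
        db_path=Path(os.path.expanduser(db_path)),
        collection=env.get("LVM_COLLECTION", "memory"),
        chunk_size=int(env.get("LVM_CHUNK_SIZE", "400")),
        chunk_overlap=int(env.get("LVM_CHUNK_OVERLAP", "50")),
    )
    # Assumption: an overlap as large as the chunk itself would never advance.
    if settings.chunk_overlap >= settings.chunk_size:
        raise ValueError("LVM_CHUNK_OVERLAP must be smaller than LVM_CHUNK_SIZE")
    return settings
```

Note that LVM_DIMS has to match the chosen model, so switching LVM_MODEL to bge-m3 would also mean setting LVM_DIMS to 1024.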
Embedding Model Comparison
Tested on Chinese memory queries (M1 Mac, 16GB):
| Model | Dimensions | Size | Hit Rate (Top-3) | Speed |
|---|---|---|---|---|
| qwen3-embedding:4b | 2560 | ~2.5GB | 100% ✅ | 232ms |
| bge-m3 | 1024 | ~570MB | 40% | 180ms |
| nomic-embed-text | 768 | 274MB | 30% | 150ms |
Recommendation: qwen3-embedding:4b for Chinese/English mixed content.
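The exact evaluation procedure behind the hit rates is not described here. One common way to define a Top-3 hit rate, assuming each test query has a single expected memory, is sketched below; the function name and the data shapes are hypothetical.

```python
from typing import Dict, Sequence


def top_k_hit_rate(
    results: Dict[str, Sequence[str]],
    expected: Dict[str, str],
    k: int = 3,
) -> float:
    """Fraction of queries whose expected item appears in the first k results.

    results maps each query to its ranked result IDs (best first);
    expected maps each query to the one ID that should be retrieved.
    """
    if not expected:
        raise ValueError("need at least one query")
    hits = 0
    for query, target in expected.items():
        ranked = list(results.get(query, []))
        if target in ranked[:k]:
            hits += 1
    return hits / len(expected)
```

To compare models on your own notes, you would run the same query set against an index built with each model and compare the resulting fractions.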
Architecture
Your .md files → chunking → Ollama embed → Qdrant (local file) → cosine search
No Docker. No cloud. No API keys. Just local files + Ollama.
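The chunking and cosine-search stages of this pipeline can be sketched in plain Python. This is a simplified illustration, not the package's implementation: the function names are hypothetical, `toy_embed` is a stand-in for the Ollama embedding call, and a brute-force scan replaces Qdrant. The chunk size and overlap defaults follow the configuration table.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def chunk_text(text: str, size: int = 400, overlap: int = 50) -> List[str]:
    """Split text into windows of `size` chars; neighbours share `overlap` chars."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query: str,
    chunks: List[str],
    embed: Callable[[str], Vector],
    limit: int = 3,
) -> List[Tuple[float, str]]:
    """Score every chunk against the query and return the best `limit` matches."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


def toy_embed(text: str) -> List[float]:
    """Stand-in embedder: counts of a few letters. NOT a real embedding model."""
    return [float(text.lower().count(ch)) for ch in "abcdefghij"]
```

In the real tool the embedding step goes through Ollama and the vectors live in a local Qdrant store, so only the overall flow carries over from this sketch.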
License
MIT
File details
Details for the file local_vector_memory-0.1.0.tar.gz.
File metadata
- Download URL: local_vector_memory-0.1.0.tar.gz
- Upload date:
- Size: 10.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 10a814d5820342d7538a76badccea59ac6ee5e4845ea072a22a018b37989835d |
| MD5 | b433fe9850037980659201f8421d19f0 |
| BLAKE2b-256 | 945d69d7eae860b2a443fc1559ec662d3f554df25bfee73b8cdac8fb2ba12e60 |
File details
Details for the file local_vector_memory-0.1.0-py3-none-any.whl.
File metadata
- Download URL: local_vector_memory-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e3a80e0d9d3afa8b42b32f32ca3c2e411625ed58c3a1be6e441f69e4f023b519 |
| MD5 | f1e29bbe56eedf626f659303f914ca7e |
| BLAKE2b-256 | bcf33a4fdaa3b49e6d875b2e0a586cd809a03dd52948587e31404e073c664fe4 |