Zero-cloud local vector memory CLI — Ollama embeddings + Qdrant

local-vector-memory

Zero-cloud, local-first vector memory CLI. Powered by Ollama embeddings + Qdrant.

100% local, 100% free, supports Chinese out of the box.

Why?

Most vector memory solutions require cloud APIs (OpenAI, Pinecone, etc.). This one runs entirely on your machine — perfect for privacy-first setups, air-gapped environments, or just saving money.

Features

  • 🔒 100% local — Ollama embeddings, local Qdrant file storage
  • 🇨🇳 Chinese-first — defaults to qwen3-embedding:4b (2560d, best Chinese accuracy)
  • Fast — ~230ms/query on M1 Mac
  • 📦 Zero cloud deps — no API keys, no Docker, no signup
  • 🔄 Auto reindex — point at your markdown files, rebuild index in seconds
  • 🎯 Accurate — 100% Top-3 hit rate on the Chinese memory queries used for the model comparison below

Quick Start

Prerequisites

# Install Ollama (https://ollama.com)
curl -fsSL https://ollama.com/install.sh | sh

# Pull embedding model
ollama pull qwen3-embedding:4b

# Install Python dependencies (Qdrant client and requests)
pip install qdrant-client requests

Install

pip install local-vector-memory

Usage

# Initialize (first time)
lvm init

# Add a memory
lvm add "OpenClaw baseUrl must be http://localhost:11434 without /v1"

# Search
lvm search "how to fix baseUrl"
lvm search "baseUrl配置" --limit 3

# Reindex markdown files
lvm reindex --dir ~/notes --glob "**/*.md"

# List stats
lvm stats

Configuration

Environment variables (or .env file):

| Variable | Default | Description |
| --- | --- | --- |
| LVM_OLLAMA_URL | http://localhost:11434 | Ollama API URL |
| LVM_MODEL | qwen3-embedding:4b | Embedding model |
| LVM_DIMS | 2560 | Vector dimensions (model-dependent) |
| LVM_DB_PATH | ~/.local-vector-memory/qdrant | Qdrant storage path |
| LVM_COLLECTION | memory | Qdrant collection name |
| LVM_CHUNK_SIZE | 400 | Text chunk size (chars) |
| LVM_CHUNK_OVERLAP | 50 | Overlap between chunks (chars) |
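
To make the table above concrete, here is a minimal sketch of how these variables could be read with their documented defaults. It is not the package's actual loader: `LvmConfig` and `load_config` are hypothetical names, and the overlap check is an assumption about what a sensible loader would enforce.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LvmConfig:
    """Hypothetical container for the LVM_* settings listed in the table."""
    ollama_url: str
    model: str
    dims: int
    db_path: Path
    collection: str
    chunk_size: int
    chunk_overlap: int


def load_config(env=None) -> LvmConfig:
    """Read LVM_* settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    cfg = LvmConfig(
        # Strip a trailing slash so paths can be appended to the URL safely.
        ollama_url=env.get("LVM_OLLAMA_URL", "http://localhost:11434").rstrip("/"),
        model=env.get("LVM_MODEL", "qwen3-embedding:4b"),
        dims=int(env.get("LVM_DIMS", "2560")),
        db_path=Path(os.path.expanduser(env.get("LVM_DB_PATH", "~/.local-vector-memory/qdrant"))),
        collection=env.get("LVM_COLLECTION", "memory"),
        chunk_size=int(env.get("LVM_CHUNK_SIZE", "400")),
        chunk_overlap=int(env.get("LVM_CHUNK_OVERLAP", "50")),
    )
    # Assumed sanity check: an overlap as large as the chunk would never advance.
    if not 0 <= cfg.chunk_overlap < cfg.chunk_size:
        raise ValueError("LVM_CHUNK_OVERLAP must be >= 0 and smaller than LVM_CHUNK_SIZE")
    return cfg
```

Since LVM_DIMS is model-dependent, switching models means changing both values together, for example `LVM_MODEL=bge-m3` with `LVM_DIMS=1024`.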

Embedding Model Comparison

Tested on Chinese memory queries (M1 Mac, 16GB):

| Model | Dimensions | Size | Hit Rate (Top-3) | Speed (per query) |
| --- | --- | --- | --- | --- |
| qwen3-embedding:4b | 2560 | ~2.5GB | 100% | 232ms |
| bge-m3 | 1024 | ~570MB | 40% | 180ms |
| nomic-embed-text | 768 | 274MB | 30% | 150ms |

Recommendation: qwen3-embedding:4b for Chinese/English mixed content.
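
The README does not say how the Top-3 hit rate was measured or how many queries were used. As a sketch of the usual definition (the share of queries whose expected memory appears among the first three results), something like the following could reproduce the metric on your own data. `top_k_hit_rate` and the query and memory ids are hypothetical.

```python
def top_k_hit_rate(results, expected, k=3):
    """Fraction of queries whose expected id appears in the first k results.

    results:  {query: [memory ids ordered best-first]}
    expected: {query: the memory id that should be found}
    """
    if not expected:
        raise ValueError("no queries to score")
    hits = sum(1 for query, want in expected.items() if want in results.get(query, [])[:k])
    return hits / len(expected)


# Made-up example: the first query finds its target at rank 3, the second only at rank 4.
results = {"q1": ["a", "b", "c", "d"], "q2": ["x", "y", "z", "w"]}
expected = {"q1": "c", "q2": "w"}
print(top_k_hit_rate(results, expected))  # 0.5
```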

Architecture

Your .md files → chunking → Ollama embed → Qdrant (local file) → cosine search

No Docker. No cloud. No API keys. Just local files + Ollama.
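
The pipeline above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's code: `embed` is any callable that returns a vector (in the real tool this role is played by an Ollama embedding call), and an in-memory list stands in for the local Qdrant collection. The chunk defaults mirror LVM_CHUNK_SIZE and LVM_CHUNK_OVERLAP. All function names are hypothetical.

```python
import math


def chunk_text(text, size=400, overlap=50):
    """Split text into windows of `size` chars; each shares `overlap` chars with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):  # this window already reaches the end of the text
            break
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(text, embed, size=400, overlap=50):
    """Chunk the text and pair every chunk with its embedding vector."""
    return [(chunk, embed(chunk)) for chunk in chunk_text(text, size, overlap)]


def search(query, index, embed, limit=3):
    """Return the `limit` best (score, chunk) pairs by cosine similarity."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, vec), chunk) for chunk, vec in index]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:limit]
```

A brute-force scan like this only suits small collections; a vector store such as Qdrant does the same ranking with persistence and indexing.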

License

MIT

Download files

Source Distribution

local_vector_memory-0.1.0.tar.gz (10.1 kB)

Uploaded Source

Built Distribution

local_vector_memory-0.1.0-py3-none-any.whl (8.5 kB)

Uploaded Python 3

File details

Details for the file local_vector_memory-0.1.0.tar.gz.

File metadata

  • Download URL: local_vector_memory-0.1.0.tar.gz
  • Size: 10.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for local_vector_memory-0.1.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 10a814d5820342d7538a76badccea59ac6ee5e4845ea072a22a018b37989835d |
| MD5 | b433fe9850037980659201f8421d19f0 |
| BLAKE2b-256 | 945d69d7eae860b2a443fc1559ec662d3f554df25bfee73b8cdac8fb2ba12e60 |

File details

Details for the file local_vector_memory-0.1.0-py3-none-any.whl.

File hashes

Hashes for local_vector_memory-0.1.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | e3a80e0d9d3afa8b42b32f32ca3c2e411625ed58c3a1be6e441f69e4f023b519 |
| MD5 | f1e29bbe56eedf626f659303f914ca7e |
| BLAKE2b-256 | bcf33a4fdaa3b49e6d875b2e0a586cd809a03dd52948587e31404e073c664fe4 |
